DWCC v2.29 The data.world catalog collector now supports Tableau Online! This release also includes a bug fix for PowerBI.
DWCC v2.28 Bugfix release
DWCC v2.27 Added the optional CLI option tableau-graphql-page-size to the Tableau collector, which lets the user set the number of objects included in each page of paginated queries.
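To give a rough feel for the trade-off behind a page-size setting, the sketch below estimates how many paginated requests a given page size implies. The helper name `pages_needed` and the object counts are illustrative assumptions and not part of DWCC; the only detail taken from the release note is that the option controls the number of objects per page.

```shell
# Hypothetical helper (not part of DWCC): number of pages needed to fetch
# total_objects when each page holds page_size objects (ceiling division).
pages_needed() {
  total_objects="$1"
  page_size="$2"
  echo $(( (total_objects + page_size - 1) / page_size ))
}

# Illustrative numbers only: the same 1050 objects at two page sizes.
pages_needed 1050 100
pages_needed 1050 500
```

A smaller page size means more requests, each returning less data; a larger one means fewer, heavier requests.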
DWCC v2.26 Updated the PowerBI collector so that if a report is unavailable via the API, the failure is logged and cataloging continues with the rest of the repository.
DWCC v2.25 This release includes better, more user-friendly error handling and reporting. We have also added enhanced collection of Tableau metadata via the Tableau Metadata API (GraphQL endpoint). New metadata includes datasources, databases, fields, metrics, and many more inter-object relationships.
DWCC v2.24 DWCC is now distributed via Docker Hub. Additionally, there are changes to the Tableau and PowerBI collectors, the ability to change the level of messages written to the console and log file, and a new subcommand to display the DWCC license text.
The Tableau collector now emits RDF in which the object of `dct:creator` is a `dwec:Agent` instead of a string literal. This means we write additional details about the Tableau account that created the dashboard, via properties of the `dwec:Agent` resource. These details include: account name, account “full name”, and account email address (if they are populated in Tableau).
The PowerBI collector now writes resources representing PowerBI “data sources” as instances of a PowerBI-specific class, rather than `dwec:DataArtifact`.
It is now possible for users to set the level (severity) of log messages written to the console and log file. By default, we write “info” level messages; users can choose to write only errors (level=“ERROR”), errors and warnings (level=“WARN”), or all messages including debug trace (level=“DEBUG”). This is useful for troubleshooting, for example when a customer is asked to run DWCC with debug logging turned on.
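The level semantics described above can be sketched as a simple severity threshold. This is an illustration only: the function names `level_rank` and `would_log` are hypothetical, and the sketch does not show the actual DWCC option used to set the level, which is not named in these notes.

```shell
# Hypothetical helper: map a level name to a numeric severity
# (a higher number means a more severe message).
level_rank() {
  case "$1" in
    DEBUG) echo 0 ;;
    INFO)  echo 1 ;;
    WARN)  echo 2 ;;
    ERROR) echo 3 ;;
    *)     return 1 ;;
  esac
}

# Succeeds when a message at level $2 would be written
# while the configured level is $1.
would_log() {
  configured=$(level_rank "$1") || return 2
  message=$(level_rank "$2") || return 2
  [ "$message" -ge "$configured" ]
}

# Show which messages survive when the configured level is WARN.
for lvl in DEBUG INFO WARN ERROR; do
  if would_log WARN "$lvl"; then
    echo "$lvl: written"
  else
    echo "$lvl: dropped"
  fi
done
```

With the level set to WARN, only WARN and ERROR messages are written, matching the "errors and warnings" behavior described above.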
Display DWCC license information:
License information for DWCC is now available as a subcommand of DWCC. To get all licensing information, run the command:
docker run -it --rm datadotworld/dwcc:X.XX display-license
where X.XX is a version of DWCC greater than or equal to 2.24.
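As a minimal sketch, the version requirement can be checked before the command is run. The helper name `dwcc_license_cmd` is hypothetical, and it assumes a plain major.minor version string such as 2.29; only the docker command itself comes from the text above.

```shell
# Hypothetical helper: print the display-license command for a DWCC version,
# or fail if the version predates 2.24 (when the subcommand was added).
# Assumes the version is written as MAJOR.MINOR, e.g. 2.29.
dwcc_license_cmd() {
  version="$1"
  major="${version%%.*}"
  minor="${version#*.}"
  if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 24 ]; }; then
    echo "docker run -it --rm datadotworld/dwcc:${version} display-license"
  else
    echo "display-license requires DWCC 2.24 or later (got ${version})" >&2
    return 1
  fi
}

# Prints the command to run for version 2.29.
dwcc_license_cmd 2.29
```

Note that the minor component is compared as a number, so 2.9 is treated as older than 2.24.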
DWCC v2.23 Internal release
DWCC v2.22 Internal release
DWCC v2.21 Fixed some timeout issues with the Looker collector when fetching images from the Looker API. Fixed an issue with cataloging reports and dashboards based on user workspace permissions in PowerBI.
DWCC v2.20 With this release our Tableau collector now supports cataloging of workbooks and non-dashboard views as well as harvesting tags on workbooks and views. Fixed an issue in the Looker collector where preview images returned from the Looker API were missing.
DWCC v2.19 Includes a clean-up of the embedded help commands for several collectors and:
Fixes an issue with the Tableau Server collector when cataloging multi-site server instances.
Adds the --tableau-site parameter, which lets the user restrict cataloging to a single site (not required; by default all sites in the instance are scanned). The value provided to --tableau-site can be a site ID or name.
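A minimal sketch of how the optional site restriction might be added to a collector invocation. The helper name `tableau_args`, the example.com URL, and the site name are hypothetical, and the `--tableau-site=value` form is an assumption modeled on the `--tableau-api-base-url=` usage shown elsewhere in these notes; other required options are omitted.

```shell
# Hypothetical helper: build Tableau collector arguments, adding
# --tableau-site only when a site ID or name is supplied.
# The "=value" form for --tableau-site is an assumption.
tableau_args() {
  base_url="$1"
  site="${2:-}"   # optional: site ID or site name
  args="catalog-tableau --tableau-api-base-url=${base_url}"
  if [ -n "$site" ]; then
    args="${args} --tableau-site=${site}"
  fi
  echo "$args"
}

# Restricted to one site (illustrative values only).
tableau_args "https://tableau.example.com/api/3.10/" "marketing"
# No site given: all sites in the instance would be scanned.
tableau_args "https://tableau.example.com/api/3.10/"
```

This simple string-building approach does not handle site names containing spaces.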
DWCC v2.18 The Tableau collector now has a flag, --tableau-skip-images, which skips the harvesting of preview images for views. Example usage:
... catalog-tableau --tableau-api-base-url=http://ec2-44-192-86-11.compute-1.amazonaws.com/api/3.10/ --tableau-username=admin --tableau-password=password -a sc-test3 -n tableau-test --tableau-skip-images
DWCC v2.17 Adds a collector for Presto
DWCC v2.16 This release:
Adds the parameter --all-databases to the Athena collector so that it can catalog all the databases accessible from the logged-in account.
Fixes some issues with datatypes for
DWCC v2.15 This release contains the following:
The Tableau collector formerly had a CLI parameter --tableau-project-id which could be used to catalog only assets in the project with the specified ID. The parameter is now --tableau-project and takes either a project ID or project name.
Updates the MANTA collector to accommodate a minor change in the MANTA API in v1.31. Customers who have updated their MANTA instance to v1.31+ will want to use DWCC 2.15+.
The Looker collector now works for non-admin Looker users; however, when DWCC is run by a non-admin, the emitted catalog will not contain any information about databases used by Looker analysis assets (access to database information in Looker requires admin permissions).
All JDBC collectors now populate two new properties for
dwec:columnIsNullable, which contain the default value for that column in newly inserted rows, and whether the column can be null, respectively. (Note that only some databases/drivers provide this metadata…we put it in the catalog if it’s there).
DWCC v2.14 Adds a collector for Looker. Minor update to the docker-save.sh script that includes available versions in the error message if you don’t supply a version.
DWCC v2.13 Adds CLI parameters that make it possible to pass arbitrary driver properties through to the connection.
DWCC v2.12 Adds a metadata collector for SAP (formerly Sybase) SQL Anywhere.
DWCC v2.11 Improves the Dremio collector’s handling of data sources nested within multiple layers of folders, and fixes a minor issue with the Dremio collector’s harvesting of lineage metadata from the Dremio graph API.
DWCC v2.10 Adds a collector for Domo. JDBC database collectors can now catalog all schemas in the database at once (the default remains to catalog only the user's default schema).
DWCC v2.9 Adds a Tableau Server collector and extends the OpenAPI collector to include a few additional schema property metadata properties.
DWCC v2.8 Adds an Infor ION data lake collector. Optimizes collection of JDBC metadata (performance improvement).
DWCC v2.7 Adds a collector for PowerBI.
DWCC v2.6 Adds the MANTA collector.
DWCC v2.5 Upgrades the Java runtime.
DWCC v2.4 Extends handling of OpenAPI collector parameters and responses.
DWCC v2.3 Adds support for OpenAPI (fka Swagger) collector.
DWCC v2.2 A refactoring release.
DWCC v2.1 Fixes an issue with the Denodo cataloger JDBC URL port.
DWCC v2.0 We now use v2 URIs as the official locator IDs for metadata resources. This is a breaking change (for structural, intentional reasons) which is not backwards compatible with v1 URIs. For more information see the article on DWCC v2.X.
DWCC v1.20 Addresses some memory issues and open-cursor leaks.
DWCC v1.19 Adds writing statements to the catalog graph indicating that the catalog was generated by DWCC (with a version). We also added the ability to write database schema objects to the catalog graph.
DWCC v1.18 Allows you to specify alternate organization permissions and upload locations when performing an automatic upload of the metadata.
DWCC v1.16 and DWCC v1.17 Address issues with the SQL Server cataloger.
DWCC v1.15 Adds Dremio support with optional Catalog API lineage fetching.
DWCC v1.14 Enables you to change the amount of memory that gets allocated to a DWCC Docker process. See our article on allocating additional memory to Docker for more information.
DWCC v1.13 Adds support for Microsoft SQL Server and enables the JVM to use available memory in the container (useful for creating large catalogs). Additionally, we improved data type recognition in the AWS Glue cataloger.
As of DWCC v1.12 we can support not only Glue ETL jobs, but also Glue Data Catalog tables and columns.
With DWCC v1.11 you can:
Upload generated catalogs via the --upload / -U command-line parameters
Upload the DWCC log when uploading generated catalogs with --upload
Fetch an organization's current catalog with the fetch-catalog command
In DWCC v1.10 we added support for AWS Glue and AWS Athena, including cataloging ETL jobs associated with an AWS account. There is no need to mount a JDBC drivers directory, as the Glue cataloger uses the Glue API, not JDBC.
DWCC v1.9 is a bug cleanup release.
With DWCC v1.8 it is now possible to use JDBC drivers on the classpath as well as those found in a user-specified JDBC driver directory (drivers in the directory have higher precedence than classpath drivers).
DWCC v1.7 is a bug-fix release.
DWCC v1.6 adds support for arbitrary JDBC data sources and the ability to build one-off Docker images for testing, demos, etc.
With DWCC v1.5 we add support for Oracle.
In DWCC v1.4 we add support for Google BigQuery.
DWCC v1.3 brings much new functionality including:
Support for Denodo and Snowflake
Compatibility of JDBC catalogs with tables imported through data.world integrations
Ability to differentiate source information for databases cataloged from localhost
REMARKSfields into dct:descriptio
With DWCC v1.2 we support Redshift databases.
DWCC v1.1 contains documentation clarification and expansion to streamline tags on customer Docker hosts.
The initial release, DWCC v1.0, provides support for metadata catalog extraction for DB2, Hive, MySQL, and Postgres.