magnificent-lock-58916
02/14/2023, 10:46 AM

lively-jackal-83760
02/22/2023, 9:59 AM

table_path = None
if project and datasource_name:
    table_path = (
        f"{project.replace('/', REPLACE_SLASH_CHAR)}/{datasource_name}"
    )
Is it possible to change this behaviour somehow?
I want to ingest the Vertica source as usual, then ingest Tableau and create links between the ingested Vertica tables and the Tableau charts.

orange-intern-2172
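Customizing this generally means patching the source, but the quoted behaviour is easy to reproduce and test in isolation. A minimal sketch; REPLACE_SLASH_CHAR is assumed to be "|" here, which is an assumption to verify against your installed version:

```python
# Standalone sketch of the table_path logic quoted above.
# REPLACE_SLASH_CHAR = "|" is an assumption; check the tableau source for the real value.
from typing import Optional

REPLACE_SLASH_CHAR = "|"

def build_table_path(project: Optional[str], datasource_name: Optional[str]) -> Optional[str]:
    table_path = None
    if project and datasource_name:
        # slashes inside the project name are escaped so they don't split the path
        table_path = f"{project.replace('/', REPLACE_SLASH_CHAR)}/{datasource_name}"
    return table_path

print(build_table_path("Finance/EMEA", "Sales Model"))  # Finance|EMEA/Sales Model
```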
03/13/2023, 10:03 AM

acoustic-quill-54426

03/13/2023, 4:11 PM

acoustic-quill-54426
03/13/2023, 5:22 PM

Validation error of type FieldUndefined: Field 'projectLuid' in type 'Workbook' is undefined @ 'workbooksConnection/nodes/projectLuid'
That affects all versions previous to 2022.3: https://help.tableau.com/current/api/metadata_api/en-us/docs/meta_api_release_notes.html

lively-jackal-83760
05/24/2023, 11:11 AM

happy-belgium-57206

06/06/2023, 2:08 PM

miniature-painter-94073

07/10/2023, 1:10 PM

numerous-address-22061

07/13/2023, 12:50 AM

numerous-address-22061
07/15/2023, 12:32 AM

analytics.analytics.table1
and it can't figure out how to connect that Custom SQL to the actual Snowflake table, which is database1.analytics.table1.
How can I help it along here? I'd rather it not map upstream at all than just guess and generate a dataset that sits in DataHub. Ideally I'd get it to map back to its actual Snowflake table (which is already in DataHub).

fast-xylophone-28117
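One possible fix, assuming your version of the Tableau source supports the lineage_overrides.database_override_map option (worth verifying against the source docs): map the database name Tableau reports for the Custom SQL onto the real Snowflake database, so the generated upstream URN matches the dataset already in DataHub. A sketch:

```yaml
source:
  type: tableau
  config:
    # ... connection settings ...
    extract_lineage_from_unsupported_custom_sql_queries: true
    lineage_overrides:
      # Tableau reports "analytics" as the database for this Custom SQL,
      # but the table actually lives in Snowflake database "database1"
      database_override_map:
        analytics: database1
```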
08/02/2023, 7:28 PM

"tableau-login": [
  "Unable to login (check your Tableau connection and credentials): HTTPSConnectionPool(host='172.25.160.82', port=443): Max retries exceeded with url: /api/2.4/auth/signin (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)')))"
With the same user, we can log in from the UI fine.
We also confirmed the firewall is not blocking anything; a curl to the Tableau server IP address works fine from the DataHub actions pod.
On the other hand, when I use the exact same recipe on the back end and run CLI-based ingestion manually, I get past this error and hit something else: some charts, tags, and projects are ingested, while others fail with this error.
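The SSLCertVerificationError means the actions pod does not trust the self-signed certificate in Tableau's chain, which is why the browser UI and the pod behave differently. Assuming your tableau source version exposes the ssl_verify option, a sketch of the two usual workarounds (the certificate path is hypothetical):

```yaml
source:
  type: tableau
  config:
    connect_uri: 'https://172.25.160.82'
    # Option 1: point verification at a CA bundle that includes the
    # self-signed chain (hypothetical mount path inside the actions pod)
    ssl_verify: '/etc/datahub/certs/tableau-ca.pem'
    # Option 2 (insecure; last resort only):
    # ssl_verify: false
```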
{
  "error": "Unable to emit metadata to DataHub GMS: java.lang.RuntimeException: Unknown aspect browsePathsV2 for entity container",
  "info": {
    "exceptionClass": "com.linkedin.restli.server.RestLiServiceException",
    "message": "java.lang.RuntimeException: Unknown aspect browsePathsV2 for entity container",
    "status": 500,
    "id": "urn:li:container:00eafb6262a384f1fc4e9582f576ba3d"
  }
}
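"Unknown aspect browsePathsV2 for entity container" usually means the ingestion CLI is newer than the GMS server and is emitting an aspect the server does not know about yet; upgrading GMS, or pinning the CLI to the server's version, normally resolves it. A hypothetical helper for the version check (compare `datahub --version` against the server version shown in the GMS UI or its /config endpoint):

```python
# Hypothetical helper: a CLI that is ahead of the GMS server is the usual
# cause of "Unknown aspect ..." errors at ingest time.
def is_cli_ahead_of_gms(cli_version: str, gms_version: str) -> bool:
    def parse(v: str):
        # "v0.10.5.1" -> (0, 10, 5); the 4th component is a CLI-only patch
        return tuple(int(p) for p in v.lstrip("v").split(".")[:3])
    return parse(cli_version) > parse(gms_version)

print(is_cli_ahead_of_gms("0.10.5.1", "0.10.0"))  # True: expect unknown-aspect errors
```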
numerous-address-22061
08/09/2023, 4:10 PM

numerous-address-22061
08/17/2023, 5:13 PM

[2023-08-17, 05:06:24 PDT] ERROR {datahub.entrypoints:199} - Command failed: 'NoneType' object has no attribute 'get'
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/datahub/entrypoints.py", line 186, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 448, in wrapper
    raise e
  File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 397, in wrapper
    res = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
    return func(ctx, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 198, in run
    ret = loop.run_until_complete(run_ingestion_and_check_upgrade())
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 182, in run_ingestion_and_check_upgrade
    ret = await ingestion_future
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 140, in run_pipeline_to_completion
    raise e
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 132, in run_pipeline_to_completion
    pipeline.run()
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 367, in run
    for wu in itertools.islice(
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 119, in auto_stale_entity_removal
    for wu in stream:
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 143, in auto_workunit_reporter
    for wu in stream:
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 208, in auto_browse_path_v2
    for urn, batch in _batch_workunits_by_urn(stream):
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 346, in _batch_workunits_by_urn
    for wu in stream:
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 156, in auto_materialize_referenced_tags
    for wu in stream:
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 70, in auto_status_aspect
    for wu in stream:
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2590, in get_workunits_internal
    yield from self.emit_sheets()
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2028, in emit_sheets
    yield from self.emit_sheets_as_charts(
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2107, in emit_sheets_as_charts
    project_luid: Optional[str] = self._get_workbook_project_luid(workbook)
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 1438, in _get_workbook_project_luid
    if wb.get(tableau_constant.LUID) and self.workbook_project_map.get(
AttributeError: 'NoneType' object has no attribute 'get'
numerous-address-22061
08/17/2023, 5:14 PM

workbook_project_map not being set to a value?

brainy-musician-50192
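The traceback points elsewhere: `wb.get(...)` fails because `wb` itself is None (the workbook lookup came back empty), so it crashes before workbook_project_map is ever consulted. A hypothetical defensive rewrite of the failing lookup illustrates the guard that is missing:

```python
# Hypothetical defensive version of the failing lookup in tableau.py.
# The traceback shows `wb` can be None, so .get() must be guarded.
from typing import Dict, Optional

def get_workbook_project_luid(
    wb: Optional[Dict[str, str]], workbook_project_map: Dict[str, str]
) -> Optional[str]:
    if wb is None:
        # workbook metadata missing entirely -> no project to resolve
        return None
    luid = wb.get("luid")
    if luid:
        return workbook_project_map.get(luid)
    return None
```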
08/22/2023, 8:54 AM

extract_column_level_lineage (boolean, default: true):
When enabled, extracts column-level lineage from Tableau Datasources
Does this mean lineage between a Tableau data source and a Tableau chart, and not between an external table/view and a Tableau data source? If so, are there any future plans to add column lineage between Snowflake and Tableau?

strong-author-11562
08/31/2023, 9:45 PM

worried-solstice-95319

09/14/2023, 1:07 AM

magnificent-lock-58916

09/19/2023, 10:33 AM

melodic-account-56198

09/21/2023, 4:34 AM

quiet-kangaroo-60946

10/04/2023, 5:20 PM

quiet-arm-91745
10/05/2023, 12:11 PM

source:
  type: tableau
  config:
    stateful_ingestion:
      enabled: true
    connect_uri: 'tableau site'
    ingest_tags: true
    ingest_owner: true
    site: ""
    token_value: 'token'
    token_name: datahub
    extract_column_level_lineage: true
    extract_lineage_from_unsupported_custom_sql_queries: true
    extract_usage_stats: true
    ingest_embed_url: true
    ingest_tables_external: true
    page_size: 1
sink:
  type: datahub-rest
  config:
    server: '<http://datahub-datahub-gms.datahub.svc.cluster.local:8080>'
    max_threads: 1

Thanks in advance.

bulky-shoe-65107
10/16/2023, 12:42 AM

damp-computer-35583
01/18/2024, 5:42 PM

I'm running acryl-datahub, version 0.12.1.3 via the quickstart Docker setup based on the 0.12.0 version, and I have tried to extract column-level lineage from the custom SQL queries, but I don't seem to be getting any of that metadata back. Here is my YAML:
source:
  type: tableau
  config:
    connect_uri: '<https://tableau.example.com>'
    stateful_ingestion:
      enabled: false
    username: 'user'
    password: 'password'
    ingest_owner: true
    extract_lineage_from_unsupported_custom_sql_queries: true
    ingest_tags: true
    extract_usage_stats: true
    ingest_tables_external: true
    projects:
      - 'sales'
damp-computer-35583
01/18/2024, 9:39 PM

damp-computer-35583

01/18/2024, 9:41 PM

damp-computer-35583

01/18/2024, 9:41 PM

damp-computer-35583

01/18/2024, 9:42 PM

damp-computer-35583

01/18/2024, 9:43 PM

able-artist-56392
02/08/2024, 10:00 AM

Message: 'Failed to commit changes for DatahubIngestionCheckpointingProvider.'
When I disable the stateful_ingestion parameter, ingestion works successfully, but another problem appears:
dashboards are not deleted from DataHub when they are no longer present in Tableau.
I see the remove_stale_metadata parameter that allows removing deleted dashboards, but it is tied to stateful_ingestion.
How can I solve this?
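For the stale-metadata question: remove_stale_metadata does indeed live under stateful_ingestion, and stateful ingestion in turn requires a pipeline_name on the recipe, which is a frequent cause of checkpoint-commit failures. A minimal sketch, with a hypothetical pipeline name:

```yaml
pipeline_name: tableau_prod   # hypothetical name; stateful ingestion requires one
source:
  type: tableau
  config:
    stateful_ingestion:
      enabled: true
      remove_stale_metadata: true   # soft-deletes entities missing from the latest run
```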
The second issue is:
when trying to do a full ingestion of our Tableau Cloud site, ingestion fails with this warning:
"embeddedDatasourcesConnection": [
  "[{'locations': None, 'message': 'Showing partial results. The request exceeded the 20000 node limit. Use pagination, additional filtering, or both in the query to adjust results.', 'errorType': None, 'extensions': {'severity': 'WARNING', 'code': 'NODE_LIMIT_EXCEEDED', 'properties': {'nodeLimit': 20000}}, 'path': None}]",
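The NODE_LIMIT_EXCEEDED warning above means a single Metadata API query asked for more than 20000 nodes, so Tableau returned partial results. Lowering the tableau source's page_size shrinks each query; the value below is an assumed starting point to tune, not a recommendation:

```yaml
source:
  type: tableau
  config:
    # smaller pages keep each Metadata API request under the 20000-node limit
    page_size: 5
```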
and these errors rise up and appear at the beginning of the ingestion log file:
{
  "error": "Unable to emit metadata to DataHub GMS",
  "info": {
    "message": "502 Server Error: Bad Gateway for url: <https://datahub-gms>..../aspects?action=ingestProposal",
    "id": "urn:li:chart:(tableau,...)"
  }
},
{
  "error": "Unable to emit metadata to DataHub GMS",
  "info": {
    "message": "403 Client Error: Forbidden for url: <https://datahub-gms>...../entities?action=ingest",
    "id": "urn:li:chart:(tableau,...)"
  }
}
(URNs have been masked, as has the URL of my DataHub site.)
I don't have the exact time needed for ingestion, but it takes at least 45 minutes.
What could be causing these errors, and what can I do to correct them?

handsome-planet-77266
02/28/2024, 6:47 PM