Please correct me if I’m wrong but it looks like Kedro’s imp Kedro #questions

Ofir

05/05/2023, 3:16 PM

Please correct me if I’m wrong, but it looks like Kedro’s implementation has slightly overlooked the input dataset as a differentiating factor for an experiment. That is, Kedro doesn’t treat a run with a different input dataset as a different session/experiment run.

Juan Luis

05/05/2023, 3:20 PM

do you mean, in the context of Kedro Viz Experiment Tracking?

Ofir

05/05/2023, 3:21 PM

Yes, for example. How do you store two experiments, same code but different data, side-by-side?

Ofir

05/05/2023, 3:24 PM

That is a simple execution/flow question. The second question is one of methodology: how do you compare them?

Juan Luis

05/05/2023, 3:53 PM

since the metrics and json outputs can be versioned, every time you do

kedro run

the tracking outputs you've defined will be saved in a different directory, named using the timestamp of the run: https://docs.kedro.org/en/stable/visualisation/experiment_tracking.html#generate-the-run-data. So the question is how to identify which input dataset was used for each, am I right?
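
One possible way to make each run identify its own input is to save a small fingerprint of the input file as one of the tracked JSON outputs. The sketch below is only an illustration, not something proposed in the thread: the function name `describe_input` and the keys of the returned dict are hypothetical, and it assumes the input is a single local file. In a Kedro project, a node returning this dict could write to a dataset of type `tracking.JSONDataSet`, so the record lands in the same timestamped run as the metrics. How the file path reaches the function (for example as a parameter) is left open.

```python
import hashlib
from pathlib import Path


def describe_input(path: str, chunk_size: int = 1 << 20) -> dict:
    """Return a JSON-serialisable record identifying an input file.

    Hypothetical helper: the name and the returned keys are illustrative only.
    """
    file_path = Path(path)
    digest = hashlib.sha256()
    # Read in chunks so large input files do not have to fit in memory.
    with file_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "input_name": file_path.name,
        "input_size_bytes": file_path.stat().st_size,
        "input_sha256": digest.hexdigest(),
    }
```

Two runs of the same code on different data would then differ in `input_sha256`, even if the file name stayed the same.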

Ofir

05/05/2023, 3:57 PM

Yes, and how to easily specify input datasets without modifying files, e.g.

kedro run --name=my_first_exp --input-dataset=my_first_dataset.parquet

kedro run --name=my_second_exp --input-dataset=my_second_dataset.parquet

kedro viz

Ofir

05/05/2023, 3:57 PM

something along these lines..
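
For the methodology question (how to compare runs), the timestamped directories mentioned above can also be read directly, outside Kedro Viz. This is a minimal sketch under assumptions: it assumes the versioned layout `<filepath>/<timestamp>/<filename>` (for example `data/09_tracking/metrics.json/<timestamp>/metrics.json`), that each metrics file holds a flat JSON object, and the helper names `load_runs` and `side_by_side` are hypothetical.

```python
import json
from pathlib import Path


def load_runs(tracking_path: str, filename: str = "metrics.json") -> dict:
    """Map each run timestamp (directory name) to the metrics saved for it."""
    runs = {}
    # Timestamp-named directories sort chronologically when sorted as strings.
    for run_dir in sorted(Path(tracking_path).iterdir()):
        metrics_file = run_dir / filename
        if metrics_file.is_file():
            runs[run_dir.name] = json.loads(metrics_file.read_text())
    return runs


def side_by_side(runs: dict) -> list:
    """Build one row per metric: (metric name, value in run 1, value in run 2, ...).

    A metric missing from a run is reported as None for that run.
    """
    metric_names = sorted({name for metrics in runs.values() for name in metrics})
    return [
        (name, *[metrics.get(name) for metrics in runs.values()])
        for name in metric_names
    ]
```

Calling `side_by_side(load_runs(...))` on the tracking directory gives one row per metric with one column per run, in the order of the run timestamps.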

Juan Luis

05/05/2023, 4:08 PM

modular pipelines allow you to reuse the same pipeline structure for different inputs: https://docs.kedro.org/en/stable/nodes_and_pipelines/modular_pipelines.html#how-to-use-a-modular-pipeline-with-different-parameters. You could then designate different pipelines that run for different inputs already defined in your catalog. This is similar to a question that got asked a few days ago: https://kedro-org.slack.com/archives/C03RKP2LW64/p1682343231008289

Juan Luis

05/05/2023, 4:37 PM

also, @Ofir you might want to use

kedro run --from-inputs

Ofir

05/06/2023, 4:33 PM

Thanks a lot!
