# general
  • a

    ARKAPROVA SAHA

    08/17/2023, 5:17 AM
    Is anyone using Mage in production? If so, could you please share your experience here? What are the data volume and velocity of your application? Is it a batch or a streaming job? What issues are you facing in your production pipeline?
  • n

    Nikhil Garakapati

    08/17/2023, 5:28 AM
    Hello folks, I want to set up a pipeline that syncs ~5 million records to my destination using the Incremental and Update method. After the first sync, the pipeline processes all ~5 million records again and updates the changes. This puts a huge load on my server on every run. Then I came across CDC replication, which uses the binlog. How can I enable it in Mage? Source: MySQL
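To illustrate the load problem described in this message, a bookmark-based incremental sync only picks up rows whose timestamp column is newer than the last saved bookmark, instead of reprocessing every record. The following is a minimal sketch under stated assumptions: the `updated_at` column, the in-memory rows, and the `incremental_batch` helper are all hypothetical, and this is neither Mage's implementation nor binlog-based CDC.

```python
from datetime import datetime
from typing import Optional


def incremental_batch(rows, bookmark: Optional[datetime]):
    """Return the rows changed since `bookmark`, plus the new bookmark value."""
    if bookmark is None:
        changed = list(rows)  # first sync: every row is new
    else:
        changed = [r for r in rows if r["updated_at"] > bookmark]
    # Advance the bookmark to the newest timestamp seen; keep it if nothing changed.
    new_bookmark = max((r["updated_at"] for r in changed), default=bookmark)
    return changed, new_bookmark


# Hypothetical source table with an `updated_at` column.
rows = [
    {"id": 1, "updated_at": datetime(2023, 8, 1)},
    {"id": 2, "updated_at": datetime(2023, 8, 10)},
    {"id": 3, "updated_at": datetime(2023, 8, 17)},
]

first, bookmark = incremental_batch(rows, None)       # full initial sync
rows[1]["updated_at"] = datetime(2023, 8, 18)         # one row changes afterwards
second, bookmark = incremental_batch(rows, bookmark)  # only the changed row
```

A bookmark approach like this still queries the source table on each run and cannot see deletes, which is one reason binlog-based CDC is attractive for large tables.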
  • n

    Nikhil Garakapati

    08/17/2023, 7:22 AM
    Exception thrown when attempting to run <function BlockExecutor.execute.<locals>.__execute_with_retry at 0xffffa7af83a0>, attempt 1 of 1. I'm triggering multiple sources to the same destination (GBQ), and I'm getting this error. Does mageai allow multiple sources to be triggered against the same destination?
  • a

    Andras Gabor - Kojak

    08/17/2023, 8:11 AM
    Hi, I am trying to offer Mage as part of a set of applications in our company. I am able to deploy Mage on k8s. My goal is to serve Mage under a URL: https://<fixed-name>/mage/ Has anyone been able to configure a path-prefix ingress (nginx) for Mage? I am having issues because the application refers to resources with an absolute path starting with /.
  • o

    Octavian Stolnicu

    08/17/2023, 10:22 AM
    Hi guys, quick question about global variables, probably with a simple answer. Is there a way to set global variables in the interface (global meaning for all pipelines, not just the current pipeline) without using environment variables or command-line variables? Secrets seem to be properly global, not pipeline-global, so I guess I could use those, but they are still set on a pipeline, so they can easily be deleted along with that pipeline or be difficult to locate when there are multiple pipelines.
  • n

    Nisha Biradar

    08/17/2023, 10:42 AM
    Hi, could you provide me with a comprehensive list of the table formats that are supported for data export (like Parquet, Delta, Iceberg, etc.)?
  • m

    Michael Olawepo

    08/17/2023, 2:04 PM
    @all Does anyone have experience with, or know of, any open-source tool that I can use to downsize a production SQL Server DB for testing purposes? I have a legacy database (SQL Server) and would like a downsized version of it that retains the relational constraints and the source data distribution, with synthetic data applied for data privacy and for sharing with third parties.
  • t

    Thomas Genet

    08/17/2023, 11:06 PM
    Hi guys! Just looking at Mage's architecture https://docs.mage.ai/production/deploying-to-cloud/architecture, is it possible to deploy just the executor to GCP Cloud Run? I want to deploy a pipeline that is expected to run 2-3 times a week but is event-triggered. I have a Cloud Function that is triggered when a file is added to a GCS bucket, and that function then sends out a Pub/Sub message with parameters, which will trigger Cloud Run.
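The handoff described in this message (a function publishes parameters, and a push subscription delivers them to a service) can be sketched as follows. Pub/Sub push delivery wraps the payload in a JSON envelope whose `message.data` field is base64-encoded. The helper names, bucket name, and file path below are hypothetical, and the sketch only shows encoding and decoding, not the Pub/Sub client or any Mage API.

```python
import base64
import json


def build_push_envelope(params: dict) -> dict:
    """Build a minimal envelope shaped like a Pub/Sub push request body."""
    data = base64.b64encode(json.dumps(params).encode("utf-8")).decode("ascii")
    return {"message": {"data": data}}


def parse_push_envelope(envelope: dict) -> dict:
    """Recover the pipeline parameters on the receiving (Cloud Run) side."""
    raw = base64.b64decode(envelope["message"]["data"])
    return json.loads(raw.decode("utf-8"))


# Hypothetical parameters describing the file that landed in the bucket.
envelope = build_push_envelope({"bucket": "example-bucket", "name": "incoming/file.csv"})
params = parse_push_envelope(envelope)
```

The decoded `params` dictionary is what the receiving service would pass on as runtime variables for the pipeline run.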
  • n

    Nikhil Garakapati

    08/18/2023, 11:42 AM
    How can I change the timezone configuration in production? And why have the timestamps in my destination tables changed to UTC after the pipeline run?
  • a

    Andras Gabor - Kojak

    08/18/2023, 12:56 PM
    Hi, I am trying to implement the following: during login, populate information that is specific to a given user and store it in the session / SessionResource.py. I am able to do this. Now I would like to use some of that information inside a pipeline. Similar to how you can use env variables or secrets in pipelines, I would like to use user-specific values that are populated during login; these are basically user-specific secrets. Do you think this is possible? Do pipelines even have a user and session concept associated with them at runtime? @DANGerous, what do you think?
  • j

    Jon Simpson

    08/18/2023, 5:10 PM
    Can blocks declare a particular version of a dependency? Say a package like boto or smart_open has a bug but is currently used in many places. Can I create a pipeline variant that uses a new version and run it alongside the other jobs until I feel good about it?
  • f

    Federico Ostrit

    08/18/2023, 9:59 PM
    Hello, good day. I'm using Mage's Helm chart and want to use ExternalSecrets to set the "MAGE_DATABASE_CONNECTION_URL" variable and thus connect to an external Postgres. Neither of these values works: existingSecret: "" or extraEnvs:. Is there any way to use an ExternalSecret to reference the connection string for the external
  • m

    Marcello Dichiera

    08/19/2023, 5:01 PM
    Hi, I am new to Mage. I was wondering how I can create a pipeline to extract a dataset from Kaggle using the Kaggle API and save the CSV file to my local path: anomaly_detec/data? I am running Mage with mage start. Thanks a lot, Marcello
  • n

    Nisha Biradar

    08/20/2023, 4:09 PM
    Hi, what are the recommended strategies for using ad hoc queries effectively within Mage?
  • a

    Anurag Telang

    08/21/2023, 11:33 AM
    Hello, I am new to Mage. I could not find any reference on how to create a pipeline using Oracle. Could I please get some reference links?
  • v

    Vicente Ramirez

    08/21/2023, 1:23 PM
    Good morning everyone, I hope you have a great week ahead. I have a practical question. My use case is to replicate an on-premises MS SQL Server database into the cloud. I want to add a dbt layer in between in order to model complex data. If my goal is to schedule this replication once a day or once an hour (hopefully with CDC), should I use a Data Integration pipeline or an ETL batch one? I tried both using BigQuery as the destination. Data Integration was more intuitive, but when the job started it was SUPER slow: data was syncing in batches of 50k rows, and it took about 2 hours to send a 750k-row table to BQ. Using the ETL batch pipeline it took 3 seconds. What's the difference here? When should I use one or the other? Thanks a lot in advance.
  • l

    Luis Moscote Diaz

    08/21/2023, 3:50 PM
    Hello everyone, I am building a POC using Mage. Can I use projects previously created with Mage on Python 3.10 on another Mage server running Python 3.8? If so, what steps should I follow? Thanks! We may have to change our Python version, so I want to check what potential limitations I could face.
  • m

    Mike mirza

    08/21/2023, 4:54 PM
    Hey everyone, I have Mage running in Docker on a server. It has a GitHub repository as its remote and a pipeline that is scheduled to run every day. Is there an environment variable or some other setting I can use to make it pull from the remote before running the pipeline?
  • m

    Marcello Dichiera

    08/21/2023, 6:20 PM
    Hi, sorry for the basic question: how do I run an existing pipeline that I created yesterday? I ran mage start 'name of the project', but it only showed me the example pipeline.
  • s

    Sujith Kumar.S

    08/21/2023, 7:17 PM
    Do we need to use an external Redis service to run multiple schedulers?
  • t

    Thomas Chung

    08/21/2023, 7:21 PM
    🎉 We're thrilled to extend a heartfelt shoutout to our community member, @Christopher Scholz! His relentless efforts and dedication have remarkably enhanced our user experience, spanning helm charts and Docker optimizations. Thank you Christopher, we're truly grateful for your unwavering passion and commitment to enriching our community! 🙏
    christopher-scholz.mp4
  • t

    Thomas Genet

    08/21/2023, 7:52 PM
    Are there any examples of using the "env" variable from the environment variables?
  • m

    Marcello Dichiera

    08/21/2023, 8:47 PM
    Sorry, another newbie question. In which folder can I put my classes so they can be used in (for example) a data loader? I tried saving the file in the data loader folder of my project and then importing it with "from myproject.data_loader.MyClassfile import MyClass", but it doesn't work. Where am I going wrong? Thanks!
  • r

    Radek

    08/21/2023, 9:19 PM
    Hi, do you think there's a way to have the Kafka extractor use GSSAPI (Kerberos authentication) somehow?
  • g

    Gwyn (she/her)

    08/22/2023, 3:05 AM
    Quick question: Does Mage have a light theme? Couldn't find any settings or issues mentioning this. 😛
  • a

    Apurva Lokare

    08/22/2023, 7:03 AM
    To get an EMR cluster for the Mage PySpark kernel, the documentation describes a way to add configuration in metadata.yml so that Mage spins up a cluster automatically. But can we set up an EMR cluster manually for the Mage PySpark kernel and reuse the same cluster every time we need it?
  • t

    Thomas

    08/22/2023, 8:57 AM
    I have a strategy question: I want to sync MySQL -> Redshift. Redshift is too slow when writing directly, so it is better to use an S3 bucket and a COPY statement. Can I use
    • the "Integration" in one step,
    • MySQL to S3 bucket and then S3 bucket to Redshift, or
    • custom code?
  • r

    Radek

    08/22/2023, 9:57 AM
    If I have to do Data Integration to liberate on-premises data to the cloud, with billions of rows in SQL databases and hundreds of TiB in HDFS, would Mage be up to the task? This would be an ongoing integration job.
  • a

    Ankur Saikia

    08/22/2023, 10:13 AM
    I am applying for jobs as a Data Engineer with Mage.ai as my core tool, but as it's fairly new to the market it doesn't have much exposure, not even on YouTube, apart from the documentation. I did manage to build a basic pipeline using a SQL database, but I need to learn more to feel confident. Are there any open projects I could work on, even for Airflow, just to get a better idea? Also, is there any site that lists questions asked of Data Engineers in interviews (including coding questions)?
  • e

    Erik Edelmann

    08/22/2023, 3:22 PM
    Hello Mage folks! Getting set up and enjoying learning more so far. One early question: what is the recommended version control setup for cases where I want to both do development and run production pipelines in a single cloud environment? It seems like GitHub Authentication is a good option, but I understand there are a few different ways to go about this, along with other Git-related features.