# general
  • shikhar
    09/21/2025, 1:15 PM
    https://graft.rs/
  • airhorns
    09/21/2025, 1:39 PM
    Hey folks — I was going to open a feature request ticket for an atomic compare_and_swap function on the store, but figured I could check my thinking here first. Am I right in saying that SlateDB's position on how transactions are implemented means that an atomic compare_and_swap setter function isn't really possible? By atomic, I mean one logical, high-performance operation that can run at high concurrency, interleaved with many other executions. But I think in the optimistic locking world, the implementation of compare_and_swap is the same as a transaction that reads the key, does the compare, then writes the key, and checks for conflicts at the end, so it's not really possible to make it one lower-level, fancier operation that relies on some in-memory lock manager or something, right?
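    A sketch of what that could look like, assuming a hypothetical transaction API (the `Db`, `begin`, `get`, `put`, `commit`, and `TxnError` names here are illustrative, not SlateDB's actual interface):
    ```rust
    // Hypothetical sketch: compare_and_swap expressed as an optimistic
    // transaction. All names here are assumed, not SlateDB's real API.
    async fn compare_and_swap(
        db: &Db,
        key: &[u8],
        expected: Option<&[u8]>,
        new: &[u8],
    ) -> Result<bool, TxnError> {
        let txn = db.begin().await?;
        // Read inside the transaction so the key joins the read set.
        let current = txn.get(key).await?;
        if current.as_deref() != expected {
            return Ok(false); // compare failed; no write
        }
        txn.put(key, new).await?;
        // Optimistic concurrency: commit re-validates the read set and
        // reports a conflict if another writer touched `key` meanwhile.
        match txn.commit().await {
            Ok(()) => Ok(true),
            Err(TxnError::Conflict) => Ok(false), // lost the race; caller may retry
            Err(e) => Err(e),
        }
    }
    ```
    Under that model there's nothing lower-level to exploit: the conflict check at commit time is doing exactly the work a dedicated CAS primitive would do.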
  • criccomini
    09/22/2025, 9:33 PM
    @almog I'll trade you code reviews for https://github.com/slatedb/slatedb/pull/909 😄
  • criccomini
    09/23/2025, 9:09 PM
    ```
    2025-09-23T21:08:48.074723Z  INFO slatedb::mem_table_flush: memtable flush thread exiting [result=Ok(())]
    2025-09-23T21:08:48.074732Z  INFO slatedb::mem_table_flush: notifying in-memory memtable of shutdown [result=Ok(())]
    2025-09-23T21:08:48.074909Z  INFO slatedb::db: compactor task exited [result=Ok(())]
    2025-09-23T21:08:48.074919Z  INFO slatedb::db: write task exited [result=Ok(())]
    2025-09-23T21:08:48.074948Z  INFO slatedb::db: wal buffer task exited [result=Ok(())]
    2025-09-23T21:08:48.074956Z  INFO slatedb::db: mem table flush task exited [result=Ok(())]
    2025-09-23T21:08:48.074961Z  INFO slatedb::db: db closed
    ```
    It's beautiful.. 🥹
  • criccomini
    09/23/2025, 9:10 PM
    Removing SlateDBError::BackgroundTaskShutdown
  • criccomini
    09/23/2025, 9:33 PM
    https://github.com/slatedb/slatedb/pull/918
  • criccomini
    09/23/2025, 9:33 PM
    😄
  • criccomini
    09/23/2025, 9:34 PM
    Next will be https://github.com/slatedb/slatedb/issues/889 😄
  • criccomini
    09/23/2025, 9:34 PM
    Need to rework the errors.rs file too
  • criccomini
    09/23/2025, 9:37 PM
    I'm also looking at the Windows failure on main... probably some temp dir issue or something silly
  • airhorns
    09/24/2025, 9:16 PM
    has anyone messed about with Google Cloud Rapid Storage for the WAL?

    > Rapid Storage: A new Cloud Storage zonal bucket that provides industry-leading <1ms random read and write latency, 20x faster data access, 6 TB/s of throughput, and 5x lower latency for random reads and writes compared to other leading hyperscalers.

    It's not GA yet, but my read of it is that it's approximately just a raw API to Colossus that offers muuuch better performance with similar durability to normal GCS. It is, however, zonal, not regional, so not nearly as many nines of availability, but still many nines of durability and easy compute/storage separation. Public info I've seen is here: https://cloud.google.com/blog/products/storage-data-transfer/high-performance-storage-innovations-for-ai-hpc and here: https://cloud.google.com/blog/products/storage-data-transfer/how-the-colossus-stateful-protocol-benefits-rapid-storage
  • criccomini
    09/25/2025, 4:49 PM
    Okay.. today is flaky test day 😛
  • criccomini
    09/25/2025, 4:50 PM
    Time to dig in
  • criccomini
    09/25/2025, 5:37 PM
    Got one 😎 https://discord.com/channels/1232385660460204122/1420162288618442823/1420825486119931944
  • criccomini
    09/25/2025, 6:09 PM
    Got another 🔨 https://github.com/slatedb/slatedb/pull/922
  • criccomini
    09/25/2025, 9:24 PM
    > can you write me a bash script that generates CPU load across all my cores? Ideally, I'd like it to vary so it goes up and down every so often

    This little script is exposing all kinds of fun flaky tests lol
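    The script itself isn't in the thread; a rough equivalent in Rust (the 2s busy / 1s idle duty cycle is an arbitrary assumption) could be:
    ```rust
    use std::thread;
    use std::time::{Duration, Instant};

    // Sketch of a varying CPU-load generator: one spinner per core,
    // alternating busy and idle phases so the load ramps up and down.
    fn main() {
        let cores = thread::available_parallelism().map_or(1, |n| n.get());
        let handles: Vec<_> = (0..cores)
            .map(|_| {
                thread::spawn(|| loop {
                    let busy_until = Instant::now() + Duration::from_secs(2);
                    let mut x: u64 = 0;
                    while Instant::now() < busy_until {
                        x = x.wrapping_add(1); // burn CPU
                    }
                    std::hint::black_box(x); // keep the loop from being optimized away
                    thread::sleep(Duration::from_secs(1)); // let the load drop
                })
            })
            .collect();
        for h in handles {
            let _ = h.join(); // runs until Ctrl-C
        }
    }
    ```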
  • criccomini
    09/25/2025, 9:24 PM
    Ran `test_should_tombstones_in_l0` more than 7000 times without failure on my laptop. Ran it with the script and it failed in < 10 iterations lol
  • criccomini
    09/26/2025, 12:25 AM
    Got another 💥 https://github.com/slatedb/slatedb/pull/923
  • criccomini
    09/26/2025, 10:45 PM
    @flaneur233 IIRC, we discussed defaulting reads to dirty=true? I was looking at our code, but it appears we have ReadOptions::default() use #[derive(Default)], which sets `dirty` to false. Is this something you were planning on changing with the txn work?
  • criccomini
    09/26/2025, 11:23 PM
    Ah, nm, I got `dirty` and `DurabilityFilter` confused. 😛
  • flaneur233
    09/27/2025, 2:46 AM
    sorry I might need to look up some references to refresh my memory on this part. these details are a bit tough to wrap my head around at times. 😂
  • flaneur233
    09/27/2025, 2:52 AM
    if I remember correctly, what we wanted was for users to have read-your-write consistency. having durability_filter default to Memory allows users to read data they recently wrote but that hasn't been flushed yet. yeah, I think having this as the default for durability_filter serves this purpose. 🤯
  • flaneur233
    09/27/2025, 2:57 AM
    on the other hand, `dirty: true` allows users to read uncommitted data from other transactions, which is usually unrelated to "read-your-write consistency" imo.
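    To make the distinction concrete, the two options look roughly like this (the field and variant names follow the discussion above, but the exact shapes are assumptions, not SlateDB's real definitions):
    ```rust
    // Assumed shapes for illustration; not SlateDB's exact API.
    pub enum DurabilityFilter {
        Memory, // see writes as soon as they're in the memtable (read-your-writes)
        Remote, // only see writes that have been durably flushed to object storage
    }

    pub struct ReadOptions {
        // Durability axis: which writes are visible based on flush state.
        pub durability_filter: DurabilityFilter,
        // Isolation axis: whether reads may see uncommitted data from
        // other transactions. Orthogonal to durability.
        pub dirty: bool,
    }
    ```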
  • criccomini
    09/27/2025, 5:55 AM
    Ya I got mixed up.
  • criccomini
    09/29/2025, 4:46 PM
    Been pulling at the errors.rs thread a bit over the weekend. Some random thoughts/observations:
    1. It seems idiomatic in Rust to have a ton of error types (especially for our internal errors). So I guess I'm OK with the 40+ types we have.
    2. I'm struggling with where to put the transient/retry concept. I feel like it's a property of an error rather than a top-level error type itself. Then the question is: should the retry property be exposed to end users, be used only for internal (SlateDBError) errors, or both?
    3. Background tasks kind of act like users: for retriable errors, they should retry rather than set `state.error`, which basically halts the DB entirely.
    4. We handle backpressure internally right now by blocking put() calls and waiting (maybe_apply_backpressure). This feels right to me from a user perspective, but it further muddies the retry concept, since one could imagine throwing a "Backoff" error that is retryable (transient). The ergonomics around that are uglier, though. Or perhaps we want to support both? 🤔
    5. I'm starting to think we should use the `state.error` variable only for permanent errors (i.e. those that require the DB to be closed, data to be repaired, etc.). That raises the question of how we get non-fatal errors to users. We probably just want to throw the error once (i.e. mutably take it out of the variable).
    6. `Clone` on SlateDBError is really annoying (it makes the code more complex). But since we currently return `state.error` on every user-facing API call, it has to be `Clone`. I'm thinking we might be able to get around this somehow, but I'm not sure of the best approach. (This feels like it ties in with 5.)
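    Point 2 (retryability as a property of an error) could look something like the sketch below; the variant names are made up for illustration, not the current contents of errors.rs:
    ```rust
    // Illustrative only: variant names are assumptions.
    #[derive(Debug)]
    pub enum SlateDBError {
        ObjectStoreUnavailable, // transient: the store may come back
        Backoff,                // transient: backpressure, retry later
        ChecksumMismatch,       // permanent: data needs repair
        Closed,                 // permanent: DB must be closed/reopened
    }

    impl SlateDBError {
        /// Retryability as a property of the error rather than a
        /// separate top-level error type.
        pub fn is_retryable(&self) -> bool {
            matches!(
                self,
                SlateDBError::ObjectStoreUnavailable | SlateDBError::Backoff
            )
        }
    }

    // Background tasks (point 3) could then consult the property instead
    // of unconditionally setting state.error:
    fn handle_task_error(err: SlateDBError) {
        if err.is_retryable() {
            // back off and retry the failed operation
        } else {
            // record as fatal; the DB must be closed and/or repaired
        }
    }
    ```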
  • Sujeet
    09/30/2025, 4:13 PM
    I've been thinking about how to support custom validation rules for different compaction scheduling strategies. This would also be useful for the compaction validations we plan to add with the compaction persistence workflow. To address this, I suggest we introduce a `validate_compaction` method on the scheduler's interface. This would serve as a dedicated hook for compaction schedulers to verify the compactions they scheduled. Here are a couple of implementation ideas:
    1. Provide a default implementation within the current CompactionScheduler trait. This would be easy to adopt and wouldn't require changes to existing compaction schedulers.
    ```rust
    // Default implementation returns Ok
    pub trait CompactionScheduler: Send + Sync {
        fn maybe_schedule_compaction(&self, state: &CompactorState) -> Vec<Compaction>;
        fn validate_compaction(...) -> Result<(), SlateDBError> { Ok(()) }
    }
    ```
    2. Use a separate CompactionValidator trait. This would cleanly separate the validation concern from the scheduling logic (see the sketch after this message).
    Thoughts on the above ☝️? Curious to hear other perspectives.
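    For comparison, option 2 might look roughly like this; the parameter list is a guess, since the snippet above deliberately elides it:
    ```rust
    // Sketch of option 2: validation split into its own trait.
    // The signature is assumed; the original leaves it elided.
    pub trait CompactionValidator: Send + Sync {
        fn validate_compaction(
            &self,
            state: &CompactorState,
            compaction: &Compaction,
        ) -> Result<(), SlateDBError>;
    }
    ```
    A scheduler could implement both traits, or the compactor could accept an optional validator alongside the scheduler.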
  • Pierre
    09/30/2025, 4:21 PM
    Regarding compaction scheduling, I was exploring that the other day, but it seems the strategies that can meaningfully be implemented are pretty limited, since there's basically no insight into what's inside the SSTs? (If I didn't miss anything.)
  • Pierre
    09/30/2025, 4:22 PM
    I don't think this can be easily solved though
  • Sujeet
    09/30/2025, 4:30 PM
    Yes... the dbState would only hold the metadata. However, compaction validation currently seems to be spread across many places. I was thinking of tying it to the CompactionScheduler, since it would know how it had scheduled the compactions. That would give us a single place to validate a scheduled compaction.
  • Pierre
    09/30/2025, 4:52 PM
    Interesting: https://github.com/Barre/ZeroFS/issues/177. That's probably related to SeaweedFS, but surprising nonetheless. I wonder how they handle preconditions.