StarRocks #announcements

Beryl Chen

09/10/2025, 3:00 PM

Summit goes live in 1 hour — 9 AM PT / 12 PM ET / 6 PM CEST / 9:30 PM IST danceml Grab a coffee ☕ and join the ride with us! <!channel> 👉 See the full agenda + grab your free pass if you haven’t already: [link]

Summit promotion linkedin post.mov

👍 2

Andy Ye

09/12/2025, 9:34 PM

Hi folks, we hosted our very first StarRocks Global Summit. It was a successful event, with engineers from well-known companies such as Intuit, Celonis, Fanatics, and Pinterest sharing their use cases with StarRocks. The content was excellent, and the feedback from the audience was very positive. Many thanks to the colleagues who helped organize the summit, the speakers, and the audience from around the world. We also recorded a community voice session from our open-source users. You’re welcome to like and share it! 🎉 https://www.linkedin.com/posts/starrocks-oss_starrockssummit2025-dataanalytics-dataen[…]m=member_desktop&rcm=ACoAAAEXiCEBMiHpzHjFkEm-Y43MqOWlMShBEAo

💚 10

starrocks 3

Ronit Kapoor

09/19/2025, 9:44 PM

📢 New Video + Feedback Wanted! Hey <!channel>! Ron again - I just published a new beginner-friendly quickstart tutorial for those new to StarRocks: ▶️ StarRocks in Docker: https://www.youtube.com/watch?v=h7F4U6xEA5M&t=1s

I’d also love your input on what kind of StarRocks content you’d like to see next. It only takes a minute to share your thoughts here: 📝 Feedback Form - https://forms.gle/VwmrMMrD7QVxv7Mg9 Your feedback will help me decide what tutorials, blogs, and docs to make in the future, so I appreciate the support! Thanks for being part of the StarRocks community!

👀 2

👍 4

yay 3

rocky heart 3

🙌 1

GitHub

09/22/2025, 1:26 AM

Release - 3.5.6 New release published by yingtingdong
Release date: September 22, 2025

### Improvements
• A decommissioned BE will be forcibly dropped when all its tablets are in the recycle bin, to avoid the decommission being blocked by those tablets. #62781
• Vacuum metrics will be updated when Vacuum succeeds. #62540
• Added thread pool metrics to the fragment instance execution state report, including active threads, queue count, and running threads. #63067
• Supports S3 path-style access in shared-data clusters to improve compatibility with MinIO and other S3-compatible storage systems. You can enable this feature by setting `aws.s3.enable_path_style_access` to `true` when creating a storage volume. #62591
• Supports resetting the starting point of the AUTO_INCREMENT value via `ALTER TABLE <table_name> AUTO_INCREMENT = 10000;`. #62767
• Supports using Distinguished Name (DN) in Group Provider for group matching, improving the user group solution for LDAP/Microsoft Active Directory environments. #62711
• Supports Azure Workload Identity authentication for Azure Data Lake Storage Gen2. #62754
• Added transaction error messages to the `information_schema.loads` view to aid failure diagnosis. #61364
• Supports reusing common expressions for complex CASE WHEN expressions in Scan predicates to reduce repetitive computation. #62779
• Uses the REFRESH (instead of ALTER) privilege on the materialized view to execute REFRESH statements. #62636
• Disabled low-cardinality optimization on Lake tables by default to avoid potential issues. #62586
• Enabled tablet balancing between workers by default in shared-data clusters. #62661
• Supports reusing expressions in outer-join WHERE predicates to reduce repetitive computation. #62139
• Added Clone metrics in FE. #62421
• Added Clone metrics in BE. #62479
• Added an FE configuration item `enable_statistic_cache_refresh_after_write` to disable statistics-cache lazy refresh by default. #62518
• Masked credential information in SUBMIT TASK for better security. #62311
• `json_extract` in the Trino dialect returns a JSON type. #59718
• Supports ARRAY type in `null_or_empty`. #62207
• Adjusted the size limit for the Iceberg manifest cache. #61966
• Added a remote file-cache limit for Hive. #62288

### Bug Fixes
The following issues have been fixed:
• Secondary replicas hang indefinitely due to negative timeout values, which cause incorrect timestamp comparisons. #62805
• PublishTask may be blocked when TransactionState is REPLICATION. #61664
• Incorrect repair mechanism for Hive tables that have been dropped and recreated during materialized view refresh. #63072
• Incorrect execution plans were generated after the materialized view aggregation push-down rewrite. #63060
• ANALYZE PROFILE failures caused by PlanTuningGuide producing unrecognized strings (null explainString) in the query profiles. #63024
• Inappropriate return type of `hour_from_unixtime` and incorrect rewrite rule of `CAST`. #63006
• NPE in Iceberg manifest cache under data races. #63043
• Shared-data clusters lack support for colocation in materialized views. #62941
• Iceberg table Scan Exception during Scan Range deployment. #62994
• Incorrect execution plans were generated for view-based rewrite. #62918
• Errors and disrupted tasks because Compute Nodes are not gracefully shut down on exit. #62916
• NPE when Stream Load execution status updates. #62921
• An issue with statistics when the column name and the name in the PARTITION BY clause differ in case. #62953
• Wrong results are returned when the `LEAST` function is used as a predicate. #62826
• Invalid ProjectOperator above the table-pruning frontier CTEConsumer. #62914
• Redundant replica handling after Clone. #62542
• Failed to collect Stream Load profiles. #62802
• Ineffective disk rebalancing caused by improper BE selection. #62776
• A potential NPE crash in LocalTabletsChannel when a missing `tablet_id` leads to a null delta writer. #62861
• KILL ANALYZE does not take effect. #62842
• SQL syntax errors in histogram stats when MCV values contain single quotes. #62853
• Incorrect output format of metrics for Prometheus. #62742
• NPE when querying `information_schema.analyze_status` after the database is dropped. #62796
• CVE-2025-58056. #62801
• When SHOW CREATE ROUTINE LOAD is executed, wrong results are returned because the database is considered null if not specified. #62745
• Data loss caused by incorrectly skipping CSV headers in `files()`. #62719
• NPE when replaying batch-transaction upserts. #62715
• Publish being incorrectly reported as successful during graceful shutdown in shared-nothing clusters. #62417
• Crash in asynchronous delta writer due to a null pointer. #62626
• Materialized view refresh is skipped because the materialized view version map is not cleared after a failed restore job. #62634
• Issues caused by case-sensitive partition column validation in the materialized view analyzer. #62598
• Duplicate IDs for statements with syntax errors. #62258
• StatisticsExecutor status is overridden due to redundant state assignment in CancelableAnalyzeTask. #62538
• Incorrect e…

StarRocks/starrocks

👍 5

🎉 6
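For readers unsure what "path-style access" changes in the 3.5.6 notes above, here is a minimal Python sketch of the two common S3 addressing styles. It is not StarRocks code: the helper `s3_object_url`, the endpoint, the bucket, and the key are all hypothetical names chosen for illustration. Virtual-hosted style puts the bucket in the hostname, which generally needs DNS that resolves bucket subdomains; path style keeps the bucket in the URL path, which is often easier to run against a self-hosted endpoint such as MinIO.

```python
from urllib.parse import urlsplit


def s3_object_url(endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Build an object URL in path-style or virtual-hosted-style form."""
    parts = urlsplit(endpoint)
    if path_style:
        # Path style: the bucket is the first segment of the URL path.
        return f"{parts.scheme}://{parts.netloc}/{bucket}/{key}"
    # Virtual-hosted style: the bucket becomes a subdomain of the endpoint host.
    return f"{parts.scheme}://{bucket}.{parts.netloc}/{key}"


# Hypothetical endpoint, bucket, and object key (illustration only).
endpoint = "http://minio.example.internal:9000"
path_url = s3_object_url(endpoint, "warehouse", "db/tbl/data.parquet", path_style=True)
vhost_url = s3_object_url(endpoint, "warehouse", "db/tbl/data.parquet", path_style=False)
print(path_url)
print(vhost_url)
```

With path style, every request goes to the same hostname regardless of bucket, so no per-bucket DNS setup is required.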

GitHub

09/29/2025, 7:32 AM

Release - 4.0.0-RC02 New release published by wangsimo0

## 4.0.0-RC02
Release Date: September 29, 2025

### New Features
• Supports setting sort keys when creating Iceberg tables.
• Supports Multi-Table Write-Write Transactions, allowing users to atomically commit `INSERT`, `UPDATE`, and `DELETE` operations. These transactions are compatible with both Stream Load and `INSERT INTO` interfaces, ensuring cross-table consistency in ETL and real-time ingestion scenarios.
• Supports modifying aggregation keys of aggregate tables.

### Improvements
• Optimized Delta Lake Catalog cache configuration: adjusted default values of `DELTA_LAKE_JSON_META_CACHE_TTL` and `DELTA_LAKE_CHECKPOINT_META_CACHE_TTL` to 24 hours, and simplified Parquet handler logic. #63441
• Improved Delta Lake Catalog error log format and content for better debugging. #63389
• External groups (e.g., LDAP Group) now support role grant/revoke and display, improving SQL syntax and test coverage for stronger access control. #63385
• Strengthened Stream Load parameter consistency checks to reduce risks caused by parameter drift. #63347
• Optimized Stream Load label passing mechanism to reduce dependencies. #63334
• Improved `ANALYZE PROFILE` format: ExplainAnalyzer now supports grouping metrics by operator. #63326
• Enhanced `QueryDetailActionV2` and `QueryProfileActionV2` APIs to return results in JSON format. #63235
• Improved predicate parsing in scenarios with large numbers of CompoundPredicates. #63139
• Adjusted certain FE metrics to be leader-aware. #63004
• Enhanced `SHOW PROCESSLIST` with Catalog and Query ID information. #62552
• Improved BE JVM memory monitoring metrics. #62210
• Optimized materialized view rewrite logic and log outputs. #62985
• Optimized random bucketing strategy. #63168
• Supports resetting `AUTO_INCREMENT` start value with `ALTER TABLE <table_name> AUTO_INCREMENT = 10000;`. #62767
• Group Provider now supports matching groups by DN. #62711

### Bug Fixes
The following issues have been fixed:
• Incomplete `Left Join` results caused by ARRAY low-cardinality optimization. #63419
• Incorrect execution plan generated after materialized view aggregate pushdown rewrite. #63060
• Redundant warning logs printed in JSON field pruning scenarios when schema fields were not found. #63414
• Infinite loop caused by SIMD Batch parameter errors when inserting DECIMAL256 data in ARM environments. #63406
• Three storage-related issues: #63398
  • Cache exception when disk path is empty.
  • Incorrect Azure cache key prefix.
  • S3 multipart upload failure.
• ZoneMap filter invalidation after CHAR-to-VARCHAR schema change with Fast Schema Evolution. #63377
• ARRAY aggregation type analysis error caused by intermediate type `ARRAY<NULL_TYPE>`. #63371
• Metadata inconsistency in partial updates based on auto-increment columns. #63370
• Metadata inconsistency when deleting tablets or querying concurrently. #63291
• Failure to create `spill` directory during Iceberg table writes. #63278
• Ranger Hive Service permission changes not taking effect. #63251
• Group Provider did not support `IF NOT EXISTS` and `IF EXISTS` clauses. #63248
• Errors caused by using reserved keywords in Iceberg partitions. #63243
• Prometheus metric format issue. #62742
• Version check failure when starting replication transactions with Compaction enabled. #62663
• Missing Compaction Profile when File Bundling was enabled. #62638
• Issues handling redundant replicas after Clone. #62542
• Delta Lake tables failed to find partition columns. #62953
• Materialized views did not support Colocation in shared-data clusters. #62941
• Issues reading NULL partitions in Iceberg tables. #62934
• SQL syntax error caused by single quotes in Histogram statistics MCV (Most Common Values). #62853
• `KILL ANALYZE` command not working. #62842
• Failure collecting Stream Load profiles. #62802
• Incorrect CTE reuse plan extraction. #62784
• Rebalance failure due to incorrect BE selection. #62776
• `User Property` priority is lower than `Session Variable`. #63173

StarRocks/starrocks

🎉 8
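The multi-table transaction feature in the RC02 notes above is about cross-table atomicity: either every statement in the batch takes effect or none does. The sketch below illustrates that idea only. It uses Python's built-in SQLite as a stand-in, not StarRocks, and the tables `orders` and `payments` and the helper `load_batch` are hypothetical. StarRocks' own transaction syntax and interfaces should be taken from its documentation.

```python
import sqlite3


def load_batch(conn, orders, payments):
    """Insert into two tables atomically: both inserts commit, or neither does."""
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        cur.executemany("INSERT INTO orders(id, amount) VALUES (?, ?)", orders)
        cur.executemany("INSERT INTO payments(order_id, paid) VALUES (?, ?)", payments)
        cur.execute("COMMIT")
        return True
    except sqlite3.Error:
        # Any failure undoes the rows already written to the other table.
        cur.execute("ROLLBACK")
        return False


# isolation_level=None lets us issue BEGIN/COMMIT/ROLLBACK explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders(id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE payments(order_id INTEGER PRIMARY KEY, paid REAL)")

first_ok = load_batch(conn, [(1, 10.0)], [(1, 10.0)])
# The second batch reuses payments.order_id = 1, so its payment insert fails
# and the order row (id = 2) written just before it is rolled back as well.
second_ok = load_batch(conn, [(2, 5.0)], [(1, 5.0)])

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
payment_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
print(first_ok, second_ok, order_count, payment_count)
```

Without the surrounding transaction, the failed second batch would leave an order with no matching payment, which is the kind of half-finished ETL state the feature is meant to prevent.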

GitHub

09/30/2025, 10:18 AM

Release - 3.4.8 New release published by jaogoy
Release Date: September 30, 2025

## Behavior Change
• Lake internal tablet parallel scan (enable_lake_tablet_internal_parallel) is now enabled by default, increasing per-query internal parallelism (may raise peak resource usage) #62360

## Bug Fixes
The following issues have been fixed:

### Data Lake Analytics
• Delta Lake partition column names were forcibly converted to lowercase, causing mismatch with actual column names #62970
• Iceberg manifest cache eviction race could trigger a NullPointerException #63052
• Uncaught generic exceptions during Iceberg scan phase interrupted scan range submission and produced no metrics #63019

### Materialized Views (MV)
• Complex multi-layer projected views used in MV rewrite produced invalid plans or missing column statistics #63014 #62230
• Case mismatch of Hive external table MV partition columns was incorrectly rejected #62623
• MV refresh used only the creator’s default role, causing insufficient privilege in “no default role” or LDAP setups (role activation strategy & config introduced) #62461
• Case-insensitive conflicts in list-partitioned MV partition names led to duplicate name errors #62443
• Residual version mapping after failed MV restore caused subsequent incremental refresh to be skipped, returning empty results #62643
• Abnormal partitions after MV recovery caused FE restart NullPointerException #62563
• Non-global aggregation queries incorrectly applied aggregation pushdown rewrite producing invalid plans #63105

### Storage / Metadata
• Tablet deletion state was only updated in memory (shutdown) and not persisted, so GC still treated it as running and skipped reclamation #63623 #63620
• Concurrent query plus drop tablet led to early delvec cleanup and “no delete vector found” errors #63307
• Base and cumulative sstable sharing the same max_rss_rowid in PK index compaction were misordered, risking lost delete semantics #63362
• Possible BE crash when LakePersistentIndex destructor ran after a failed initialization #62297
• Graceful shutdown of publish thread pool silently discarded queued tasks without marking failures, creating version holes and a false “all succeeded” impression #62683
• Newly cloned replica on a newly added BE during rebalance was immediately judged redundant and removed, preventing data migration to the new node #62894
• Missing lock when reading tablet max version caused inconsistent replication transaction decisions #62280

### Query & Optimization
• Combination of date_trunc equality and raw column range predicate was reduced to a point interval, returning empty result sets (e.g. date_trunc('month', dt)='2025-09-01' AND dt>'2025-09-23') #63570
• Pushdown of non-deterministic predicates (random/time functions) produced inconsistent results #63533
• Missing consumer node after CTE reuse decision produced incomplete execution plans #63188
• Type mismatch crashes when table functions and low-cardinality (dictionary) encoding coexisted #62500 #62384

### Ingestion & Export
• Oversized CSV split into parallel fragments caused every fragment to skip header rows, leading to data loss (only the first fragment should skip) #62789
• SHOW CREATE ROUTINE LOAD without explicit DB returned job from another database with the same name #62792
• NullPointerException when sameLabelJobs became null during concurrent load job cleanup #63181

### Cluster Operations & Management
• CN normal restart or crash path incorrectly executed scale-in deregistration, harming topology consistency #63002 #63010
• Backend decommission blocked even when all tablets were already in recycle bin (no force completion) #63267
• OPTIMIZE TABLE task stuck in PENDING after thread pool rejection #62556
• Dirty tablet metadata cleanup used GTID arguments in the wrong order #62285

StarRocks/starrocks
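The CSV item in the 3.4.8 notes above (#62789) describes a file split into parallel fragments where every fragment skipped header rows. The following Python sketch is a toy model of that behavior, not the StarRocks loader. The functions `split_into_fragments` and `read_fragments` and the `buggy` flag are hypothetical, and fragments are split on line counts rather than bytes for simplicity.

```python
def split_into_fragments(lines, fragment_size):
    """Cut the file's lines into consecutive fragments of at most fragment_size lines."""
    return [lines[i:i + fragment_size] for i in range(0, len(lines), fragment_size)]


def read_fragments(fragments, skip_header, buggy):
    """Collect data rows from all fragments, skipping header lines."""
    rows = []
    for index, fragment in enumerate(fragments):
        # Buggy behavior: every fragment drops its first skip_header lines.
        # Fixed behavior: only the first fragment holds the real header.
        skip = skip_header if (buggy or index == 0) else 0
        rows.extend(fragment[skip:])
    return rows


csv_lines = ["id,name", "1,a", "2,b", "3,c", "4,d", "5,e"]
fragments = split_into_fragments(csv_lines, fragment_size=3)

buggy_rows = read_fragments(fragments, skip_header=1, buggy=True)
fixed_rows = read_fragments(fragments, skip_header=1, buggy=False)
print(buggy_rows)
print(fixed_rows)
```

In this model the buggy reader silently loses one data row per extra fragment (here the row `3,c`), which matches the "only the first fragment should skip" wording in the note.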

Beryl Chen

10/09/2025, 4:03 AM

<!channel> Friendly reminder: Today’s session, “Real-Time Analytics for Web3: Fraud Detection, Trading, and Growth at Scale,” starts at 11:30 AM IST | 1:00 PM ICT (VN/TH/Jakarta) | 2:00 PM SGT/CST/HKT/PHT/TWT | 3:00 PM KST/JST! 👉 Save your seat here: https://celerdata.wistia.com/live/events/60tbj98da5?utm_campaign=slack Join Sida Shen to explore the unique requirements of real-time analytics for Web3—why legacy stacks can’t keep up, and how companies like Coinbase are overcoming those limits. Discover how to achieve second-level data freshness (<10 s latency without batch jobs), run sub-second queries even under heavy concurrency, and query Iceberg directly without ETL overhead. The timing is perfect for participants in Asia, but the insights are global. Can’t join live? Register anyway, and we’ll send you the recording. danceml

🙌 3

party 8

GitHub

10/14/2025, 7:13 AM

Release - 3.3.19 New release published by wangsimo0

## 3.3.19
Release Date: October 14, 2025

### Bug Fixes
The following issues have been fixed:
• `UserProperty` had lower priority than Session Variables. #63173
• Materialized view refresh failures that could occur when the Hive base table was dropped and recreated. #63072
• Issues with the aggregation pushdown rewrite rule. #63060
• Inconsistencies between null columns and data columns in Boolean extraction functions for JSON. #63054
• Issues when getting partition columns in Delta Lake format tables. #62953
• Lack of colocation support for materialized views in shared-data clusters. #62941
• Projection mapping errors in view-based materialized view rewrite. #62918
• SQL syntax errors in histogram statistics when Most Common Values (MCV) contained single quotes. #62853
• `KILL ANALYZE` did not work. #62842
• CVE-2025-58056 vulnerability. #62801
• Executing `SHOW CREATE ROUTINE LOAD` without specifying a database causes wrong results. #62745
• Data loss caused by incorrectly skipping CSV headers in `files()`. #62719
• Version check failures when Replication and Compaction transactions were committed together. #62663
• Materialized view refresh is skipped because the materialized view version map is not cleared after a failed restore job. #62634
• Issues caused by case-sensitive partition column validation in the materialized view analyzer. #62598

StarRocks/starrocks

🎉 2

GitHub

10/17/2025, 8:53 AM

Release - 4.0.0 New release published by wangsimo0

# StarRocks version 4.0
## 4.0.0
Release date: October 17, 2025

### Data Lake Analytics
• Unified Page Cache and Data Cache for BE metadata, and adopted an adaptive strategy for scaling. #61640
• Optimized metadata file parsing for Iceberg statistics to avoid repetitive parsing. #59955
• Optimized COUNT/MIN/MAX queries against Iceberg metadata by efficiently skipping over data file scans, significantly improving aggregation query performance on large partitioned tables and reducing resource consumption. #60385
• Supports compaction for Iceberg tables via procedure `rewrite_data_files`.
• Supports Iceberg tables with hidden partitions, including creating, writing, and reading the tables. #58914
• Supports setting sort keys when creating Iceberg tables.
• Optimizes sink performance for Iceberg tables.
• Iceberg Sink supports spilling large operators, global shuffle, and local sorting to optimize memory usage and address small file issues. #61963
• Iceberg Sink optimizes local sorting based on Spill Partition Writer to improve write efficiency. #62096
• Iceberg Sink supports global shuffle for partitions to further reduce small files. #62123
• Enhanced bucket-aware execution for Iceberg tables to improve concurrency and distribution capabilities of bucketed tables. #61756
• Supports the TIME data type in the Paimon catalog. #58292
• Upgraded Iceberg version to 1.10.0. #63667

### Security and Authentication
• In scenarios where JWT authentication and the Iceberg REST Catalog are used, StarRocks supports the passthrough of user login information to Iceberg via the REST Session Catalog for subsequent data access authentication. #59611 #58850
• Supports vended credentials for the Iceberg catalog.
• Supports granting StarRocks internal roles to external groups obtained via Group Provider. #63385 #63258
• Added REFRESH privilege to external tables to control the permission to refresh them. #63385

### Storage Optimization and Cluster Management
• Introduced the File Bundling optimization for the cloud-native table in shared-data clusters to automatically bundle the data files generated by loading, Compaction, or Publish operations, thereby reducing the API cost caused by high-frequency access to the external storage system. #58316
• Supports Multi-Table Write-Write Transaction to allow users to control the atomic submission of INSERT, UPDATE, and DELETE operations. The transaction supports Stream Load and INSERT INTO interfaces, effectively guaranteeing cross-table consistency in ETL and real-time write scenarios. #61362
• Supports Kafka 4.0 for Routine Load.
• Supports full-text inverted indexes on Primary Key tables in shared-nothing clusters.
• Supports modifying aggregate keys of Aggregate tables. #62253
• Supports enabling case-insensitive processing on names of catalogs, databases, tables, views, and materialized views. #61136
• Supports blacklisting Compute Nodes in shared-data clusters. #60830
• Supports global connection ID. #57256
• Added the `recyclebin_catalogs` metadata view to Information Schema to display recoverable deleted metadata. #51007

### Query and Performance Improvement
• Supports DECIMAL256 data type, expanding the upper limit of precision from 38 to 76 digits. Its 256-bit storage provides better adaptability to high-precision financial and scientific computing scenarios, effectively mitigating DECIMAL128's precision overflow problem in very large aggregations and high-order operations. #59645
• Improved the performance for basic operators. #61691 #61632 #62585 #61405 #61429
• Optimized the performance of the JOIN and AGG operators. #61691
• [Preview] Introduced SQL Plan Manager to allow users to bind a query plan to a query, thereby preventing the query plan from changing due to system state changes (mainly data updates and statistics updates), thus stabilizing query performance. #56310
• Introduced Partition-wise Spillable Aggregate/Distinct operators to replace the original Spill implementation based on sorted aggregation, significantly improving aggregation performance and reducing read/write overhead in complex and high-cardinality GROUP BY scenarios. #60216
• Flat JSON V2:
  • Supports configuring Flat JSON on the table level. #57379
  • Enhanced JSON columnar storage by retaining the V1 mechanism while adding page- and segment-level indexes (ZoneMaps, Bloom filters), predicate pushdown with late materialization, dictionary encoding, and integration of a low-cardinality global dictionary to significantly boost execution efficiency. #60953
• Supports an adaptive ZoneMap index creation strategy for the STRING data type. #61960
• Enhanced query observability:
  • Optimized EXPLAIN ANALYZE output to display the execution metrics by group and by operator for better readability. #63326
  • `QueryDetailActionV2` and `QueryProfileActionV2` now support JSON format, enhancing cross-FE query capabilities. #63235
  • Supports retrieving Query Profile information across all FEs. #61345
  • SHOW PROCESSLIST statements display Catalog, Query ID, and other information. #62552
  • Enhanced query queue and process monitoring, supporting display of Running/Pending statuses. #62261
• Materialized view rewrites consider the distribution and sort keys of the original table, improving the selection of optimal materialized views. #62830

### Functions and SQL Syntax
• Added the following functions:
  • `bitmap_hash64` #56913
  • `bool_or` #57414
  • `strpos` #57278
  • `to_datetime` and `to_datetime_ntz` #60637
  • `regexp_count` #57182
  • `tokenize` <https://github.com/Star…

StarRocks/starrocks

👍 12

starrocks 11

👍🏼 1
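The DECIMAL256 item in the 4.0.0 notes above pairs a 256-bit storage width with a precision limit of 76, up from 38 for DECIMAL128. The arithmetic behind those numbers can be checked with a short Python sketch. It assumes the unscaled value is held in a signed two's-complement integer of the stated width, which is an assumption made here for illustration; the helper `max_decimal_precision` is hypothetical.

```python
def max_decimal_precision(storage_bits: int) -> int:
    """Largest digit count p such that every p-digit integer fits in a signed
    integer of the given bit width."""
    max_value = 2 ** (storage_bits - 1) - 1
    # max_value has d digits but is smaller than 10**d - 1, so only numbers
    # with up to d - 1 digits are guaranteed to fit.
    return len(str(max_value)) - 1


precision_128 = max_decimal_precision(128)
precision_256 = max_decimal_precision(256)
print(precision_128, precision_256)
```

Under that assumption, doubling the storage width doubles the number of decimal digits that can be represented without overflow.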

Beryl Chen

10/17/2025, 7:30 PM

🎉 We did it — The StarRocks Community just crossed 5,000 members! <!channel> You’ve built, tested, benchmarked, shared, and shaped StarRocks into what it is today — a fast, open-source engine driving real-time analytics at scale. Let’s call it #5KFriday rocky confused💚 Thank you for being part of the StarRocks movement! rocky heart

❤️ 5

😍 3

👍 4

🤩 3

🥳 3

starrocks 8

party 15

🙌 23

clapclap 5

Paul O'Brien

10/20/2025, 6:04 PM

🚀 StarRocks 4.0 is live! Open • Fast • Governed Highlights: - ~60% faster YoY with deep optimizations for JOINs, aggregations, and spill handling. - First-class Apache Iceberg: faster metadata parsing, hidden-partition reads/writes, a new compaction API, and optimized file writes. - JSON as a first-class type (Flat JSON V2): 3–15× faster queries without flattening. - Real-time at lower cost: file bundling, metadata caching, and smarter compaction cut cloud API calls by up to 90%. - Lakehouse governance: catalog-centric access control on Apache Iceberg (JWT passthrough + vended credentials). - Expanded workloads: Decimal256 for high-precision, large-scale aggregations; multi-statement transactions for financial and multi-stage pipelines; ASOF JOIN for time-series and AI use cases. - Operational improvements: node blacklisting, case-insensitive names, and global connection IDs. 📖 Launch blog: https://www.starrocks.io/blog/starrocks-4.0-now-available 📚 Docs: https://docs.starrocks.io/releasenotes/release-4.0/ Join the full walkthrough + live Q&A with @Sida Shen and @Ronit Kapoor this Thursday, Oct 23: https://celerdata.wistia.com/live/events/iot8atqjzj?utm_campaign=slack

🚀 18

💪 9

starrocks 21

party 10

rocky nice 1

GitHub

10/21/2025, 8:27 AM

Release - 3.5.7 New release published by yingtingdong

### Improvements
• Improved memory statistics accuracy for Scan operators by introducing retry backoff under heavy memory contention scenarios. #63788
• Optimized materialized view bucketing inference by leveraging existing tablet distribution to prevent excessive bucket creation. #63367
• Revised the Iceberg table caching mechanism to enhance consistency and reduce cache invalidation risks during frequent metadata updates. #63388
• Added the `querySource` field to `QueryDetail` and `AuditEvent` for better traceability of query origins across APIs and schedulers. #63480
• Enhanced Persistent Index diagnostics by printing detailed context when duplicate keys are detected in MemTable writes. #63560
• Reduced lock contention in materialized view operations by refining lock granularity and sequencing in concurrent scenarios. #63481

### Bug Fixes
The following issues have been fixed:
• Materialized view rewrite failures caused by type mismatch. #63659
• `regexp_extract_all` has wrong behavior and lacks support for `pos=0`. #63626
• Degraded scan performance caused by the profitless simplification of CASE WHEN with complex functions. #63732
• Incorrect DCG data reading when partial updates switch from column mode to row mode. #61529
• A potential deadlock during initialization of `ExceptionStackContext`. #63776
• Crashes in Parquet numeric conversion for ARM architecture machines. #63294
• An issue caused by the aggregate intermediate type using `ARRAY<NULL_TYPE>`. #63371
• Stability issue caused by incorrect overflow detection when casting LARGEINT to DECIMAL128 at sign-edge cases (for example, INT128_MIN). #63559
• LZ4 compression and decompression errors cannot be perceived. #63629
• `ClassCastException` when querying tables partitioned by `FROM_UNIXTIME` on INT-type columns. #63684
• Tablets cannot be repaired after a balance-triggered migration when the only valid source replica is marked `DECOMMISSION`. #62942
• Profiles lost SQL statements and Planner Trace when the PREPARE statement is used. #63519
• The `extract_number`, `extract_bool`, and `extract_string` functions are not exception-safe. #63575
• Shutdown tablets cannot be garbage-collected properly. #63595
• Profiles showing SQL as `omit` for returns of the PREPARE/EXECUTE statements. #62988
• `date_trunc` partition pruning with combined predicates that mistakenly produced EMPTYSET. #63464
• Crashes in release builds due to the CHECK in NullableColumn. #63553

StarRocks/starrocks
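The `date_trunc` pruning fix in the 3.5.7 notes above concerns combined predicates. The 3.4.8 notes earlier in this channel quote one such predicate: date_trunc('month', dt)='2025-09-01' AND dt>'2025-09-23'. Whether both notes refer to the same underlying bug is not stated here, but the expected semantics of that predicate can be sketched in plain Python. The helper `date_trunc_month` and the sample date range are assumptions made for illustration.

```python
from datetime import date, timedelta


def date_trunc_month(d: date) -> date:
    """Truncate a date to the first day of its month."""
    return d.replace(day=1)


# Sample rows: one dt value per day from 2025-08-25 through 2025-10-05.
days = [date(2025, 8, 25) + timedelta(days=i) for i in range(42)]

# Equivalent of: date_trunc('month', dt) = '2025-09-01' AND dt > '2025-09-23'
matches = [
    d for d in days
    if date_trunc_month(d) == date(2025, 9, 1) and d > date(2025, 9, 23)
]
print(matches[0], matches[-1], len(matches))
```

The predicate should keep the last week of September rather than collapse to an empty set, which is why an EMPTYSET plan for this shape of query is a wrong result and not just a slow one.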

Beryl Chen

10/21/2025, 9:06 PM

Wanted to share the latest performance benchmark! yay StarRocks 4.0 delivers a 60% year-over-year performance improvement, thanks to deeper query optimizations, faster JOINs, smarter aggregations, and steady performance across JSON & Iceberg workloads. Take a look at the full breakdown → https://celerdata.com/blog/starrocks-4.0-zero-compromise-60-faster

🚀 6

party 3

💪 5

starrocks 4

danceml 3

❤️ 2

Beryl Chen

10/22/2025, 10:54 PM

iceberg For the Iceberg users/fans here: StarRocks 4.0 enables catalog-centric access control to Iceberg so security & governance live in the Iceberg REST Session Catalog—not the engine: • JWT identity passthrough: queries run as the real user • Centralized authorization & auditing: one policy plane across engines • Vended short-lived credentials: least-privilege, no engine superusers 👉Get the full details + a step-by-step setup guide: https://celerdata.com/blog/starrocks-4.0-catalog-centric-access-control-on-apache-iceberg

🚀 4

starrocks 4

💪 3

😍 4

🙌 4

Beryl Chen

10/23/2025, 1:45 PM

Friendly reminder: Today’s session, "Announcing StarRocks 4.0," is happening at 10 AM PT | 1 PM ET! yay Save your seat here: https://celerdata.wistia.com/live/events/iot8atqjzj <!channel> Join Sida and Ron for a deep dive into what’s new and improved in StarRocks 4.0 — and get your questions answered live. They’ll cover: • 60% faster performance with smarter JOINs, aggregations, and spill control • Deeper Apache Iceberg integration with hidden partitions, faster metadata, and a compaction API • Up to 15× faster native JSON analytics • Lower real-time costs through file bundling and caching • Catalog-centric governance with unified authentication • Extended workload support for finance, time-series, and AI analytics Can’t make it? Register anyway, and we’ll send you the recording.

👍 9

starrocks 6

❤️ 16

Beryl Chen

10/24/2025, 9:31 PM

What a week! 🎉 As StarRocks 4.0 launch week wraps up, we hope you’ve gotten to know our latest version a little better. If you missed anything, here’s a quick bundle: • 🔖 Launch blog: https://www.starrocks.io/blog/starrocks-4.0-now-available • 📝 Release Note: https://docs.starrocks.io/releasenotes/release-4.0/ • 🎥 Webinar recording: https://youtu.be/3U1Y8CpJk50 • 🔖 Performance Benchmarks: https://celerdata.com/blog/starrocks-4.0-zero-compromise-60-faster • 🔖 Catalog-Centric Access Control on Apache Iceberg: https://celerdata.com/blog/starrocks-4.0-catalog-centric-access-control-on-apache-iceberg • 🔖 Delivering Query-Ready Data to Apache Iceberg: https://celerdata.com/blog/starrocks-4.0-delivering-query-ready-data-to-apache-iceberg • 🔖 Native Columnar Performance to JSON: https://celerdata.com/blog/starrocks-4.0-bringing-native-columnar-performance-to-json • 🔖 Multi-Statement Transactions in Practice: https://celerdata.com/blog/when-etl-jobs-fail-halfway-the-hidden-cost-of-non-atomic-data-operations <!channel>

dogdance 6

starrocks 4

🚀 7

👍 19

Beryl Chen

10/30/2025, 1:45 PM

Friendly reminder: Today’s session — “Locking Down the Lakehouse: Securing Apache Iceberg’s REST Catalog at Scale” — starts at 10 AM PT | 1 PM ET! yay Save your seat here: https://celerdata.wistia.com/live/events/hm7o4k79nn <!channel> Ron will guide you through Iceberg’s evolving security framework—covering OAuth2, enterprise identity integration, and new StarRocks 4.0 capabilities that enhance REST Catalog governance with catalog-level access control, JWT authentication, and temporary cloud-credential vending. Bring your questions—we’ll answer them live! Can’t make it? Register anyway, and we’ll send the full recording.

starrocks 8

👀 2

👍 3

clapclap 5

Beryl Chen

10/31/2025, 12:17 AM

Hi <!channel>, sharing a few open roles we’re hiring for at CelerData: If you enjoy building large-scale distributed systems, working on cloud platforms, or tackling complex database and security challenges, this is a great opportunity to have a direct impact on core systems used by global enterprises. Openings: 🔹 Senior Software Engineer, Database Systems (US Remote): https://lnkd.in/gN4j8-4m 🔹 Software Engineer, Cloud Platform (Bay Area): https://lnkd.in/gACMU9r2 🔹 Lead Security Engineer, Cloud Platform (US Remote): https://lnkd.in/gmcUxWhN Join us and make an impact at a fast-growing startup shaping the future of analytics. Feel free to reach out to @Luna Zeng (luna.zeng@celerdata.com) if you’d like to connect before applying — and if you know someone who might be a great fit, please feel free to forward these along too. Thanks! 💙

👍 5

dogdance 4

Beryl Chen

10/31/2025, 8:51 PM

We just rolled out our first CelerData/StarRocks monthly newsletter — your go-to roundup for feature updates, optimization tips, case studies, roadmap insights, and more. We’re starting the month strong with major updates — including the launch of StarRocks 4.0, our most advanced release to date, along with fresh tutorials, Summit recap, community highlights, partnership news, and upcoming events and meetups. 👉🏻 Check it out here: https://www.linkedin.com/pulse/october-2025-highlights-starrocks-40-launch-summit-rewind-more-lrfdc <!channel> Feel free to check it out, subscribe, and share with others!

gratitude thank you 4

🚀 7

❤️ 8

party 4

GitHub

11/10/2025, 3:28 AM

Release - 3.5.8 New release published by yingtingdong Release date: November 10, 2025 ### Improvements • Upgraded Arrow to 19.0.1 to support the Parquet legacy list to include nested, complex files. #64238 • FILES() supports legacy Parquet LIST encodings. #64160 • Automatically determine the Partial Update mode based on the session variable and the number of inserted columns. #62091 • Applied low-cardinality optimization on analytic operators above table functions. #63378 • Added configurable table lock timeout to `finishTransaction` to avoid blocking. #63981 • Shared-data clusters support table-level scan metrics attribution. #62832 • Window functions LEAD/LAG/FIRST_VALUE/LAST_VALUE now accept ARRAY type arguments. #63547 • Supports constant folding for several array functions to improve predicate pushdown and join simplification. #63692 • Supports batched API to optimize `tabletNum` retrieval for a given node via `SHOW PROC /backends/{id}`. Added an FE configuration item `enable_collect_tablet_num_in_show_proc_backend_disk_path` (Default: `true`). #64013 • Ensured `INSERT ... SELECT` reads the freshest metadata by refreshing external tables before planning. #64026 • Added `capacity_limit_reached` checks to table functions, NL-join probe, and hash-join probe to avoid constructing overflowing columns. #64009 • Added FE configuration item `collect_stats_io_tasks_per_connector_operator` (Default: ) for setting the maximum number of tasks to collect statistics for external tables. #64016 • Updated the default partition size for sample collection from 1000 to 300. #64022 • Increased lock table slots to 256 and added `rid` to slow-lock logs. #63945 • Improved robustness of Gson deserialization in the presence of legacy data. #63555 • Reduced metadata lock scope for FILES() schema pushdown to cut lock contention and planning latency. #63796 • Added Task Run execute timeout checker by introducing an FE configuration item `task_runs_timeout_second`, and refined cancellation logic for overdue runs. #63842 • Ensured `REFRESH MATERIALIZED VIEW ... FORCE` always refreshes target partitions (even in inconsistent or corrupted cases). #63844 ### Bug Fixes The following issues have been fixed: • An exception when parsing the Nullable (Decimal) type of ClickHouse. #64195 • An issue with tablet migration and Primary Key index lookup concurrency. #64164 • Lack of FINISHED status in materialized view refresh. #64191 • Schema Change Publish does not retry in shared-data clusters. #64093 • Wrong row count statistics on Primary Key tables in Data Lake. #64007 • When tablet creation times out in shared-data clusters, node information cannot be returned. #63963 • Corrupted Lake DataCache cannot be cleared. #63182 • Window function with IGNORE NULLS flags can not be consolidated with its counterpart without IGNORE NULLS flag. #63958 • Table compaction cannot be scheduled again after FE restart if the compaction was previously aborted. #63881 • Tasks fail to be scheduled if FE restarts frequently. #63966 • An issue with GCS error codes. #64066 • Instability issue with StarMgr gRPC executor. #63828 • Deadlock when creating an exclusive work group. #63893 • Cache for Iceberg tables is not properly invalidated. #63971 • Wrong results for sorted aggregation in shared-data clusters. #63849 • ASAN error in `PartitionedSpillerWriter::_remove_partition`. #63903 • BE crash when failing to get splits from morsel queue. #62753 • A bug with aggregate push-down type cast in materialized view rewrite. #63875 • NPE when removing expired load jobs in FE. #63820 • Partitioned Spill crash when removing partitions. #63825 • Materialized view rewrite throws `IllegalStateException` under certain plans. #63655 • NPE when creating a partitioned materialized view. #63830 StarRocks/starrocks

👀 1

🙌 2

Kate Shao-Community Manager

11/13/2025, 2:41 AM

Hi guys, @Yoav Nordmann just published a deep-dive comparison of StarRocks vs. ClickHouse — and the results are pretty wild.

“I put StarRocks and ClickHouse head-to-head on a 3B-row, 1TB+ dataset — running vendor-supported AWS clusters at 300 QPS concurrency.”

Spoiler: StarRocks crushed it. 💪 ✅ 40%+ lower latency on long-range queries ✅ Higher QPS with fewer nodes (ClickHouse left CPU on the table) ✅ OLTP-style updates — no deltas, no merges, just works ✅ Zero errors (while ClickHouse hit HTTP timeouts) ✅ Native Apache Iceberg write support 🔥 It wasn’t plug-and-play though — tuning was key. 👉 Dive into the full story here: https://medium.com/israeli-tech-radar/starrocks-a-database-too-fast-for-its-own-good-eb86954fc7ea

master 2

dogdance 5

👍 3

starrocks 2

GitHub

11/17/2025, 3:37 AM

Release - 4.0.1 New release published by wangsimo0 ## 4.0.1 Release Date: November 17, 2025 ### Improvements • Optimized TaskRun session variable handling to process known variables only. #64150 • Supports collecting statistics of Iceberg and Delta Lake tables from metadata by default. #64140 • Supports collecting statistics of Iceberg tables with bucket and truncate partition transform. #64122 • Supports inspecting FE `/proc` profile for debugging. #63954 • Enhanced OAuth2 and JWT authentication support for Iceberg REST catalogs. #63882 • Improved bundle tablet metadata validation and recovery handling. #63949 • Improved scan-range memory estimation logic. #64158 ### Bug Fixes The following issues have been fixed: • Transaction logs were deleted when publishing bundle tablets. #64030 • The join algorithm cannot guarantee the sort property because, after joining, the sort property is not reset. #64086 • Issues related to transparent materialized view rewrite. #63962 ### Behavior Changes • Added the property `enable_iceberg_table_cache` to Iceberg Catalogs to optionally disable Iceberg table cache and allow it always to read the latest data. #64082 • Ensured `INSERT ... SELECT` reads the freshest metadata by refreshing external tables before planning. #64026 • Increased lock table slots to 256 and added `rid` to slow-lock logs. #63945 • Temporarily disabled `shared_scan` due to incompatibility with event-based scheduling. #63543 • Changed the default Hive Catalog cache TTL to 24 hours and removed unused parameters. #63459 • Automatically determine the Partial Update mode based on the session variable and the number of inserted columns. #62091 StarRocks/starrocks

🙌 1

Kate Shao-Community Manager

11/17/2025, 3:55 PM

Hey everyone! 👋 <!channel> In the spirit of gratitude, we’ve officially opened nominations for the StarRocks Awards 2025! If you know someone who has made a meaningful impact on the community — through code, content, meetups, Q&A support, or advocacy — we’d love to hear from you. Self-nominations are welcome too! rocky nice 🗓️ Nomination period: Nov 17–29 (Results to be announced in early Dec.) 📝 Nomination form: https://docs.google.com/forms/d/e/1FAIpQLSfhs-i2W4V95JSafNWBQJA0K4w86chXkRUjbAr1nSxTYPHpsg/viewform?usp=sharing&ouid=106630649310039560524 Let’s give thanks to the people who lift our community! clapclap

star 1 2

starrocks 2

🙌 4

rock star 2

GitHub

11/18/2025, 3:51 AM

Release - 3.3.20 New release published by wangsimo0 ## 3.3.20 Release Date: November 18, 2025 ### Bug Fixes The following issues have been fixed: • CVE-2024-47561. #64193 • CVE-2025-59419. #64142 • Incorrect row count for lake Primary Key tables. #64007 • Window function with IGNORE NULLS flags can not be consolidated with its counterpart without IGNORE NULLS flag. #63958 • ASAN error in `PartitionedSpillerWriter::_remove_partition`. #63903 • Wrong results for sorted aggregation in shared-data clusters. #63849 • NPE when creating a partitioned materialized view. #63830 • Partitioned Spill crash when removing partitions. #63825 • NPE when removing expired load jobs in FE. #63820 • A potential deadlock during initialization of `ExceptionStackContext`. #63776 • Degraded scan performance caused by the profitless simplification of CASE WHEN with complex functions. #63732 • Materialized view rewrite failures caused by type mismatch. #63659 • Materialized view rewrite throws `IllegalStateException` under certain plans. #63655 • LZ4 compression and decompression errors cannot be perceived. #63629 • Stability issue caused by incorrect overflow detection when casting LARGEINT to DECIMAL128 at sign-edge cases (for example, INT128_MIN). #63559 • `date_trunc` partition pruning with combined predicates that mistakenly produced EMPTYSET. #63464 • Incomplete `Left Join` results caused by ARRAY low-cardinality optimization. #63419 • An issue caused by the aggregate intermediate type using `ARRAY<NULL_TYPE>`. #63371 • Metadata inconsistency in partial updates based on auto-increment columns. #63370 • Incompatible Bitmap index reuse for Fast Schema Evolution in shared-data clusters. #63315 • Unnecessary CN deregistration during pod restart/upgrade. #63085 • Profiles showing SQL as `omit` for returns of the PREPARE/EXECUTE statements. #62988 StarRocks/starrocks

Kate Shao-Community Manager

11/19/2025, 7:42 PM

Hi guys, 👋 Recently, we’ve published some awesome use cases from StarRocks Summit 2025! Check out how companies are scaling real-time analytics with StarRocks: • Intuit: Sub-4s real-time analytics at 100K+ events/sec https://celerdata.com/blog/how-intuit-achieved-sub-4-second-real-time-analytics-at-100k-events-per-second • SplitMetrics: Replaced PostgreSQL for customer-facing analytics https://celerdata.com/blog/how-splitmetrics-replaced-postgresql-with-starrocks-for-customer-facing-analytics • Demandbase: Built a scalable lakehouse with Iceberg + StarRocks https://celerdata.com/blog/how-demandbase-built-a-scalable-data-lakehouse-with-starrocks-and-iceberg • Coinbase: Powering blockchain analytics at scale https://celerdata.com/blog/how-coinbase-powers-blockchain-analytics-at-scale-with-starrocks Feel free to reshare or pass them along!

database parrot 3

clapclap 4

Kate Shao-Community Manager

11/21/2025, 12:13 AM

📚 New Use Case: Cisco Webex × StarRocks With millions of global users relying on Webex every day, Cisco needed faster, more reliable real-time analytics than its fragmented Pinot + Trino stack could deliver. To break through the limits of joins, stability, and operational complexity, Cisco turned to StarRocks. The impact: • 50% lower query latency • 70% of queries faster than Trino • 10× boost on common workloads with materialized views • Unified SQL + simplified operations across all analytics teams Check out the full deep dive 👉 https://www.starrocks.io/blog/how-cisco-webex-unified-real-time-analytics-with-starrocks

👍 3

party 3

GitHub

11/24/2025, 10:27 AM

Release - 3.4.9 New release published by jaogoy Release Date: November 24, 2025 ### Behavior Changes • Changed the return type of `json_extract` in the Trino dialect from STRING to JSON. This may cause incompatibility in CAST, UNNEST, and type check logic. #59718 • The metric that reports “connections per user” under `/metrics` now requires admin authentication. Without authentication, only total connection counts are exposed, preventing information leakage of all usernames via metrics. #64635 • Removed the deprecated system variable `analyze_mv`. Materialized view refresh no longer automatically triggers ANALYZE jobs, avoiding large numbers of background statistics tasks. This changes expectations for users relying on legacy behavior. #64863 • Changed the overflow detection logic of casting from LARGEINT to DECIMAL128 on x86. `INT128_MIN * 1` is no longer considered an overflow to ensure consistent casting semantics for extreme values. #63559 • Added a configurable table-level lock timeout to `finishTransaction`. If a table lock cannot be acquired within the timeout, finishing the transaction will fail for this round and be retried later, rather than blocking indefinitely. The final result is unchanged, but the lock behavior is more explicit. #63981 ### Bug Fixes The following issues have been fixed: • During BE start, if loading tablet metadata from RocksDB times out, RocksDB may restart loading from the beginning and accidentally pick up stale tablet entries, risking data version loss. #65146 • Data corruption issues related to CRC32C checksum for delete-vectors of Lake Primary Key tables. #65006 #65354 #65442 #65354 • When the internal `flat_path` string is empty because the JSON hyper extraction path is or all paths are skipped, calling `substr` will throw an exception and cause BE crash. #65260 • When spilling large strings to disk, insufficient length checks, using 32‑bit attachment sizes, and issues in the BlockReader could cause crashes. #65373 • When multiple HTTP requests reuse the same TCP connection, if a non‑ExecuteSQL request arrives after an ExecuteSQL request, the `HttpConnectContext` cannot be unregistered at channel close, causing HTTP context leaks. #65203 • Primitive value loss issue under certain circumstances when JSON data is being flattened. #64939 #64703 • Crash in `ChunkAccumulator` when chunks are appended with incompatible JSON schemas. #64894 • In `AsyncFlushOutputStream`, asynchronous I/O tasks may attempt to access a destroyed `MemTracker`, resulting in use‑after‑free crashes. #64735 • Concurrent Compaction tasks against the same Lake Primary Key table lack integrity checks, which could leave metadata in an inconsistent state after a failed publish. #65005 • When spilling Hash Joins, if the build side’s `set_finishing` task failed, it only recorded the status in the spiller, allowing the Probe side to continue, and eventually causing a crash or an indefinite loop. #65027 • During tablet migration, if the only newest replica is marked as DECOMMISSION, the version of the target replica is outdated and stuck at VERSION_INCOMPLETE. #62942 • Use-after-free issue because the relevant Block Group is not released when `PartitionedSpillerWriter` is removing partitions. #63903 #63825 • BE crash caused by MorselQueue's failure to get splits. #62753 • In shared-data clusters, Sorted-by-key Scans with multiple I/O tasks could produce wrong results in sort-based aggregations. #63849 • On ARM, reading Parquet columns for certain Hive external tables could crash in LZ4 conversion when copying NULL bitmaps because the destination null buffer pointer was stale due to out-of-order execution. #63294 StarRocks/starrocks
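The `finishTransaction` behavior change in these notes describes a common pattern: try to take a lock with a deadline, and if that fails, give up for this round and retry later instead of blocking. A minimal Python sketch of that pattern follows. It is not StarRocks code; `try_finish`, `finish_with_retries`, and all timeout values are hypothetical names and numbers chosen for the example.

```python
import threading
import time


def try_finish(table_lock: threading.Lock, timeout_s: float) -> bool:
    """Run one round: return False if the lock is not acquired within timeout_s."""
    if not table_lock.acquire(timeout=timeout_s):
        return False  # fail this round only; the caller decides when to retry
    try:
        # Placeholder for the work that needs the table lock.
        return True
    finally:
        table_lock.release()


def finish_with_retries(table_lock, timeout_s=0.05, max_rounds=5, backoff_s=0.05):
    """Retry rounds until one succeeds; return the round number, or None if all fail."""
    for round_no in range(1, max_rounds + 1):
        if try_finish(table_lock, timeout_s):
            return round_no
        time.sleep(backoff_s)  # wait before the next round instead of blocking on the lock
    return None
```

With an uncontended lock, `finish_with_retries(threading.Lock())` succeeds in the first round. If another holder keeps the lock for the whole retry window, the function returns `None` after `max_rounds` rounds rather than hanging.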

✅ 3

Beryl Chen

11/25/2025, 3:15 PM

Friendly reminder: Today’s session with Yufei Gu (Snowflake), an Apache Polaris PPMC member — Building an Open, Governed Lakehouse with Apache Polaris — starts at 10 AM PT | 1 PM ET! yay Save your seat here: https://celerdata.wistia.com/live/events/7bv3gv0h3d <!channel> Yufei will dive into how Polaris bridges openness and governance—bringing the community together to build standards that reduce data fragmentation and lay the foundation for a truly interoperable, governed lakehouse. Stick around for a live Q&A with Yufei and Ron on Apache Polaris, Apache + StarRocks, and the broader lakehouse ecosystem! If you can’t join live, feel free to register — we’ll send the recording afterward.

✅ 1

👍 5

Kate Shao-Community Manager

11/26/2025, 8:05 PM

🦃 Happy Thanksgiving to all our U.S. community members! <!here> And to everyone around the world — wishing you a wonderful week, whether you’re celebrating or just enjoying some well-deserved rest. 💚 And don’t forget: StarRocks Awards nominations are still open! If there’s someone you think deserves recognition, we’d love to hear from you. ⭐ 📝 Nomination form: https://docs.google.com/forms/d/e/1FAIpQLSfhs-i2W4V95JSafNWBQJA0K4w86chXkRUjbAr1nSxTYPHpsg/viewform?usp=sharing&ouid=106630649310039560524

👍 1

🎉 2

🙌 1

❤️ 1

GitHub

11/27/2025, 2:24 AM

Release - 3.5.9 New release published by yingtingdong ### Improvements • Added transaction latency metrics to FE for observing timing across transaction stages. #64948 • Supports overwriting S3 unpartitioned Hive tables to simplify full-table rewrites in data lake scenarios. #65340 • Introduced CacheOptions to provide finer-grained control over tablet metadata caching. #65222 • Supports sample statistics collection for INSERT OVERWRITE to ensure statistics stay consistent with the latest data. #65363 • Optimized the statistics collection strategy after INSERT OVERWRITE to avoid missing or incorrect statistics due to asynchronous tablet reports. #65327 • Introduced a retention period for partitions dropped or replaced by INSERT OVERWRITE or materialized view refresh operations, keeping them in the recycle bin for a while to improve recoverability. #64779 ### Bug Fixes The following issues have been fixed: • Lock contention and concurrency issues related to `LocalMetastore.truncateTable()`. #65191 • Lock contention and replica check performance issues related to TabletChecker. #65312 • Incorrect error logging when changing user via HTTP SQL. #65371 • Checksum failures caused by DelVec CRC32 upgrade compatibility issues. #65442 • Tablet metadata load failures caused by RocksDB iteration timeout. #65146 • When the internal `flat_path` string is empty because the JSON hyper extraction path is or all paths are skipped, calling `substr` will throw an exception and cause BE crash. #65260 • The PREPARED flag in fragment execution is not correctly set. #65423 • Inaccurate write and flush metrics caused by duplicated load profile counters. #65252 • When multiple HTTP requests reuse the same TCP connection, if a non‑ExecuteSQL request arrives after an ExecuteSQL request, the `HttpConnectContext` cannot be unregistered at channel close, causing HTTP context leaks. #65203 • MySQL 8.0 schema introspection errors (Fixed by adding session variables `default_authentication_plugin` and `authentication_policy`). #65330 • SHOW ANALYZE STATUS errors caused by unnecessary statistics collection for temporary partitions created after partition overwrite operations. #65298 • Global Runtime Filter race in the Event Scheduler. #65200 • Data Cache is aggressively disabled because the minimum Data Cache disk size constraint is too large. #64909 • An aarch64 build issue related to the `gold` linker automatic fallback. #65156 StarRocks/starrocks

👍 2

🎉 1