Beryl Chen
07/25/2024, 1:30 PM
Beryl Chen
07/26/2024, 5:05 PM
GitHub
07/30/2024, 3:20 AM
`\t` and `\n` as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
Bug Fixes
Fixed the following issues:
• Frequent INSERT and UPDATE operations on Primary Key tables may cause write and query delays in the database. #47838
• When a Primary Key table encounters data persistence failures, the persistent index may fail to capture the error, leading to data loss and reporting the error "Insert found duplicate key". #48045
• Materialized views may report insufficient permissions when refreshed. #47561
• Materialized view reports the error "For input string" when refreshed. #46131
• During materialized view refresh, the lock is held excessively long, causing the Leader FE to be restarted by the deadlock detection script. #48256
• Queries against views whose definitions contain an IN clause may return inaccurate results. #47484
• Global Runtime Filter causes incorrect results. #48496
• MySQL protocol `COM_CHANGE_USER` does not support `conn_attr`. #47796
Behavior Changes
• When users create a non-partitioned table without specifying the bucket number, the minimum bucket number the system sets for the table is `16` (instead of `2` based on the formula `2*BE or CN count`). If users want to set a smaller bucket number when creating a small table, they must set it explicitly. #47005
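A minimal sketch of setting the bucket number explicitly for a small table. The database, table, and column names are hypothetical, and the bucket count of 2 is only an illustration:

```sql
-- Hypothetical table: example_db.small_dim.
-- BUCKETS is given explicitly, so the system does not apply its own minimum.
CREATE TABLE example_db.small_dim (
    id INT,
    name VARCHAR(64)
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 2;
```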
StarRocks/starrocks
Beryl Chen
08/02/2024, 9:52 PM
GitHub
08/08/2024, 8:14 AM
• auto_partition_max_creation_number_per_load
• max_partition_number_per_table
• max_bucket_number_per_partition
• max_column_number_per_table
• Supports runtime optimization of table data distribution, ensuring optimization tasks do not conflict with DML operations on the table. #43747
• Added an observability interface for the global hit rate of Data Cache. #48450
• Added the SQL function array_repeat. #47862
Improvements
• Optimized the error messages for Routine Load failures due to Kafka authentication failures. #46136 #47649
• Stream Load supports using `\t` and `\n` as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
• Optimized the asynchronous statistics collection method for write operators, addressing the issue of increased latency when there are many import tasks. #48162
• Added the following BE dynamic parameters to control resource hard limits during loading, reducing the impact on BE stability when writing a large number of tablets. #48495
Including:
• load_process_max_memory_hard_limit_ratio
• enable_new_load_on_memory_limit_exceeded
• Added consistency checks for Column IDs within the same table to prevent Compaction errors. #48498
• Supports persisting PIPE metadata to prevent metadata loss due to FE restarts. #48852
Bug Fixes
• The process could not end when creating a dictionary from an FE Follower. #47802
• Inconsistent information returned by the SHOW PARTITIONS command in shared-data clusters and shared-nothing clusters. #48647
• Data errors caused by incorrect type handling when loading data from JSON fields to `ARRAY<BOOLEAN>` columns. #48387
• The `query_id` column in `information_schema.task_runs` cannot be queried. #48876
• During Backup, multiple requests for the same operation are submitted to different Brokers, causing request errors. #48856
• Downgrading to versions earlier than v3.1.11 or v3.2.4 causes Primary Key table index decompression failures, leading to query errors. #48659
Downgrade Notes
If you have used the renaming column feature, you must rename the columns to their original names before downgrading your cluster to an earlier version. You can check the audit log of your cluster after upgrading to identify any `ALTER TABLE RENAME COLUMN` operations and the original names of the columns.
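A minimal sketch of the rename-back step, assuming a hypothetical table `example_db.orders` whose column `order_ts` was renamed to `created_at` after the upgrade. The real table and column names have to come from your own audit log:

```sql
-- Hypothetical names: example_db.orders, order_ts (original), created_at (current).
-- Restore the original column name before downgrading the cluster.
ALTER TABLE example_db.orders RENAME COLUMN created_at TO order_ts;
```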
StarRocks/starrocks
Beryl Chen
08/15/2024, 1:30 PM
Beryl Chen
08/19/2024, 10:54 PM
GitHub
08/23/2024, 6:13 AM
`BYTE_ARRAY` data with a `logical_type` of `JSON` in Parquet files to the JSON type in StarRocks. #49385
• Optimized error messages for Files() when Access Key ID and Secret Access Key are missing. #49090
• `information_schema.columns` supports the `GENERATION_EXPRESSION` field. #49734
Bug Fixes
Fixed the following issues:
• Downgrading a v3.3 shared-data cluster to v3.2 after setting the Primary Key table property `"persistent_index_type" = "CLOUD_NATIVE"` causes a crash. #48149
• Exporting data to CSV files using SELECT INTO OUTFILE may cause data inconsistency. #48052
• Queries encounter failures during concurrent query execution. #48180
• Queries would hang due to a timeout in the Plan phase without exiting. #48405
• After disabling index compression for Primary Key tables in older versions and then upgrading to v3.2.9, accessing `page_off` information causes an array out-of-bounds crash. #48230
• BE crash caused by concurrent execution of ADD/DROP COLUMN operations. #49355
• Queries against negative `TINYINT` values in ORC format files return `None` on the aarch64 architecture. #49517
• If the disk write operation fails, failures of `l0` snapshots for Primary Key Persistent Index may cause data loss. #48045
• Partial Update in Column mode for Primary Key tables fails under scenarios with large-volume data updates. #49054
• BE crash caused by Fast Schema Evolution when downgrading a v3.3.0 shared-data cluster to v3.2.9. #42737
• `partition_live_number` does not take effect. #49213
• The conflict between index persistence and compaction in Primary Key tables could cause clone failures. #49341
• Modifications of `partition_live_number` using ALTER TABLE do not take effect. #49437
• Rewrite of CTE distinct grouping sets generates an invalid plan. #48765
• RPC failures polluted the thread pool. #49619
• Authentication failures when loading files from AWS S3 via PIPE. #49837
Behavior Changes
• Added a check for the `meta` directory in the FE startup script. If the directory does not exist, it will be automatically created. #48940
• Added a memory limit parameter `load_process_max_memory_hard_limit_ratio` for data loading. If memory usage exceeds the limit, subsequent loading tasks will fail. #48495
StarRocks/starrocks
GitHub
09/04/2024, 9:04 AM
`count(*)` on certain tables returns NULL. #49288
• `partition_live_number` does not take effect. #49213
• FE throws a tablet exception: BE disk offline, and cannot migrate tablets. #47833
StarRocks/starrocks
GitHub
09/05/2024, 5:55 AM
`max(partition_column)`. #49391
• Partition pruning is used to optimize query performance when the partition column is a generated column (a column that is calculated based on a native column in the table), and the query predicate filter condition includes the native column. #48692
• Supports masking authentication information for Files() and PIPE. #47629
• Introduced a new statement `show proc '/global_current_queries'` to view queries running on all FE nodes. `show proc '/current_queries'` only shows queries running on the current FE node. #49826
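For reference, the two statements side by side, as they would be run from a MySQL client connected to an FE node:

```sql
-- Queries running on all FE nodes.
SHOW PROC '/global_current_queries';

-- Queries running only on the FE node this session is connected to.
SHOW PROC '/current_queries';
```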
Bug Fixes
Fixed the following issues:
• The source cluster's BE nodes were mistakenly added to the current cluster when exporting data to the destination cluster via StarRocks external tables. #49323
• TINYINT data type returned NULL when StarRocks reads ORC files using `select * from files` from clusters deployed on aarch64 machines. #49517
• Stream Load fails when loading JSON files containing large Integer types. #49927
• Incorrect schema is returned due to improper handling of invisible characters when users load CSV files with Files(). #49718
• An issue with temporary partition replacement in tables with multiple partition columns. #49764
Behavior Changes
• Introduced a new parameter `object_storage_rename_file_request_timeout_ms` to better accommodate backup scenarios with cloud object storage. This parameter will be used as the backup timeout, with a default value of 30 seconds. #49706
• `to_json`, `CAST(AS MAP)`, and `STRUCT AS JSON` will return NULL instead of throwing an error by default when the conversion fails. You can allow errors by setting the system variable `sql_mode` to `ALLOW_THROW_EXCEPTION`. #50157
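A minimal sketch of opting back in to errors for the current session. It assumes no other SQL modes need to be preserved in the session:

```sql
-- Let failed conversions raise an error again instead of returning NULL.
SET sql_mode = 'ALLOW_THROW_EXCEPTION';

-- Inspect the value now in effect.
SHOW VARIABLES LIKE 'sql_mode';
```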
StarRocks/starrocks
GitHub
09/09/2024, 8:28 AM
Beryl Chen
09/09/2024, 8:53 PM
Beryl Chen
09/12/2024, 1:30 PM
GitHub
07/02/2025, 2:19 AM
`enable_trace_historical_node` controls this behavior (Default: `false`). #57083
• Storage Volume adds native support for Google Cloud Storage (GCS): You can now use GCS as a backend storage volume and manage and access GCS resources through the native SDK. #58815
### Improvements
• Optimized error messages when creating Hive external tables fails. #60076
• Optimized `count(1)` query performance using the `file_record_count` in Iceberg metadata. #60022
• Refined the Compaction scheduling logic to avoid delayed scheduling when all subtasks succeed. #59998
• Added `JAVA_OPTS="--add-opens=java.base/java.util=ALL-UNNAMED"` to BE and CN after upgrading to JDK 17. #59947
• Supports modifying the `kafka_broker_list` property via the ALTER ROUTINE LOAD command when Kafka Broker endpoints change. #59787
• Supports reducing build dependencies of the Docker base image through parameters. #59772
• Supports accessing Azure using Managed Identity authentication. #59657
• Improved error messages when querying external data via the `Files()` function with duplicate path column names. #59597
• Optimized LIMIT pushdown logic. #59265
### Bug Fixes
Fixed the following issues:
• Partition pruning issue when queries include Max and Min aggregations and empty partitions. #60162
• Incorrect query results when rewriting queries with materialized views due to missing NULL partitions. #60087
• Refresh errors on Iceberg external tables when using partition expressions based on `str2date`. #60089
• Incorrect partition range when creating temporary partitions using the START END syntax. #60014
• Incorrect display of Routine Load metrics on non-leader FE nodes. #59985
• BE/CN crashes when executing queries containing `COUNT(*)` window functions. #60003
• Stream Load failures when the target table name contains Chinese characters. #59722
• Overall loading failures to triple-replica tables when loading to a secondary replica fails. #59762
• Missing parameters in SHOW CREATE VIEW output. #59714
### Behavior Changes
• Some FE metrics include the `is_leader` label. #59883
StarRocks/starrocks
Mehdi Sidi Boumedine
07/02/2025, 1:03 PM
GitHub
07/04/2025, 8:58 AM
`slow_lock_print_stack` to prevent process stalls in large clusters when printing thread stacks. #59967
• Reduced unnecessary locks during tablet scheduling. #59744
### Bug Fixes
Fixed the following issues:
• SplitOR fails to prune scan columns. #60223
• Incorrect query plan for null-aware left anti joins. #60119
• Incorrect query results when rewriting queries with materialized views due to missing NULL partitions. #60087
• Partition pruning errors when tables contain empty partitions. #60162
• Refresh errors on Iceberg external tables when using partition expressions based on `str2date`. #60089
• Unexpected behavior caused by materialized view schema changes. #60079
• Issues related to low-cardinality global dictionaries in UNION operators. #60075
• Incorrect partition ranges for temporary partitions created using the START END syntax. #60014
• Lock issues with SUBMIT TASK. #60026
• Partial updates fail on Primary Key tables under certain conditions. #60052
• Crashes caused by BE failing to create directories due to a lack of permissions to access storage paths. #60028
• Cache failures due to cache key duplication in concurrent scenarios. #60053
• Hive table metadata background refresh failure in Unified Catalog. #55215
• Query failures caused by incorrect return types of CASE WHEN. #59972
• Query failures when Delta Lake tables UNION themselves. #60030
• Partition creation failure when writing to multiple tables within the same transaction. #59954
• Queries could return empty results instead of errors when tablet versions were updated during execution. #53060
• Queries against modified columns in a table return null after upgrading to v3.4. #59941
• Authentication information is printed in logs. #59907
• Metadata refresh failures for external tables in Hive Catalog. #54596
• CACHE SELECT failures for tables after schema changes. #59812
• Broker Load could not recover after FE Leader shifts. #59732
• Stream Load failures when the target table name contains Chinese characters. #59722
• Incorrect query results in external tables due to search key hash collisions (affecting Iceberg/Delta/Paimon). #59781
StarRocks/starrocks
Beryl Chen
07/09/2025, 11:01 PM
GitHub
07/11/2025, 5:54 AM
`information_schema.loads` view. Users can view the execution details of all INSERT, Broker Load, Stream Load, and Routine Load subtasks in this view. Additional fields have been added to help users better understand the status of loading tasks and the association with parent jobs (PIPES, Routine Load Jobs).
• Support modifying `kafka_broker_list` via the `ALTER ROUTINE LOAD` statement.
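A minimal sketch of pointing a Routine Load job at new brokers. The job name and broker addresses are hypothetical, and the pause/resume steps assume the usual requirement that a job be paused before it is altered:

```sql
-- Hypothetical job: example_db.example_job; hypothetical broker endpoints.
PAUSE ROUTINE LOAD FOR example_db.example_job;

ALTER ROUTINE LOAD FOR example_db.example_job
FROM KAFKA
(
    "kafka_broker_list" = "new-broker1:9092,new-broker2:9092"
);

RESUME ROUTINE LOAD FOR example_db.example_job;
```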
### Bug Fixes
The following issues have been fixed:
• Under high-frequency loading scenarios, Compaction could be delayed. #59998
• Querying Iceberg external tables via Unified Catalog would throw an error: `not support getting unified metadata table factory`. #59412
• When using `DESC FILES()` to view CSV files in remote storage, incorrect results were returned because the system mistakenly inferred `xinf` as the FLOAT type. #59574
• `INSERT INTO` could cause BE to crash when encountering empty partitions. #59553
• When StarRocks reads Equality Delete files in Iceberg, it could still access deleted data if the data had already been removed from the Iceberg table. #59709
• Query failures caused by renaming columns. #59178
### Behavior Changes
• The default value of the BE configuration item `skip_pk_preload` has been changed from `false` to `true`. As a result, the system will skip preloading Primary Key Indexes for Primary Key tables to reduce the likelihood of `Reached Timeout` errors. This change may increase query latency for operations that require loading Primary Key Indexes.
StarRocks/starrocks
Ronit Kapoor
07/18/2025, 5:43 PM
GitHub
07/21/2025, 2:11 AM
`enable_predicate_expr_reuse` to control predicate pushdown. #60603
• Supports a retry mechanism when fetching Kafka partition information fails. #60513
• Removed the restriction requiring exact mapping of partition columns between materialized views and base tables. #60565
• Supports building Runtime In-Filters to enhance aggregation performance by filtering data during aggregation. #59288
### Bug Fixes
Fixed the following issues:
• COUNT DISTINCT queries crash due to low-cardinality optimization for multiple columns. #60664
• Incorrect matching of global UDFs when multiple functions share the same name. #60550
• Null pointer exception (NPE) issue during Stream Load import. #60755
• Null pointer exception (NPE) issue when starting FE during a recovery from a cluster snapshot. #60604
• BE crash caused by column mode mismatch when processing short-circuit queries with out-of-order values. #60466
• Session variables set via PROPERTIES in SUBMIT TASK statements did not take effect. #60584
• Incorrect results for `SELECT min/max` queries under specific conditions. #60601
• Incorrect bucket pruning when the left side of a predicate is a function, leading to incorrect query results. #60467
• Crash for queries against a non-existent `query_id` via Arrow Flight SQL. #60497
### Behavior Changes
• The default value of `lake_compaction_allow_partial_success` is set to `true`. Compaction operations can now be marked as successful even if partially completed, preventing blockage of subsequent compaction tasks. #60643
StarRocks/starrocks
Beryl Chen
07/22/2025, 4:00 PM
Beryl Chen
07/24/2025, 1:45 PM
GitHub
07/31/2025, 9:38 AM
`cpu_core_used_permille` limit in resource groups. #61177
• Conflict between ALTER jobs and partition creation tasks. #61167
• NPE caused by missing `globalStateMgr` in `ConnectContext`. #60880
• Partition creation failed when partition names matched case-insensitively but had different values. #60909
• Lock competition caused by synchronous access to partition statistics. #61041
• ANALYZE tasks stuck in `pending` state after FE restart. #61113
• Issue with JIT (Just-In-Time) compilation in BE. #61060
• Leader address issue in Starmgr. #61016
• CVE vulnerabilities in Broker. #60908
• Actual number of JDBC connections exceeded the `jdbc_connection_pool_size` limit. #61004
• CVE-2022-41404 vulnerability. #59689
• CVEs related to Parquet and HttpClient5. #58750
• Partition not removed from `_partition_map` when physical partition ID was empty. #60842
• Missing version check in shared-data clusters. #59422
• Transaction log missing when publishing logs in batches in shared-data clusters. #60949
• Concurrent publishing of the same transaction when Batch Publish is enabled in shared-data clusters. #57574
• Statistics overwrite issue caused by lack of semi-synchronous mode. #60897
• Inaccurate `maxInstantTime` used for filtering Hudi files when retrieving latest merged file slices. #60927
• TaskRun state incompatible with earlier versions. #60438
• CVE-2025-52999 vulnerability. #60795
• Vulnerability caused by `log4j-1.2.17-cloudera6` in Broker. #59579
• BE crash when loading OOM partitions. #60778
• Base Compaction tasks blocking other compaction tasks. #60711
• Inefficient handling of error string truncation. #60878
• Materialized view rewrite failed in multi-FE environments. #60841
• INSERT OVERWRITE failed on manually created partitions. #60750
• Issue caused by using random distribution in aggregate keys. #60702
• Crash caused by low cardinality rewrite in `multi_distinct_count`. #60664
• Issue with Pivot resolving fields. #60748
• Upgraded `hudi-common` to 1.0.2. #59501
• BE crash when CLONE and DROP TABLE run concurrently. #61359
StarRocks/starrocks
Beryl Chen
08/05/2025, 7:10 PM
GitHub
08/07/2025, 7:58 AM
`INSERT INTO FILES`, you can now specify the Parquet version via the `parquet.version` property to improve compatibility with other tools when reading the exported files. #60843
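A rough sketch of an export that sets the property. The bucket path, source table, and credentials are hypothetical, and the value `"2.4"` is an assumed example; check the FILES() documentation for the values your version accepts:

```sql
-- Hypothetical path, table, and credentials; "2.4" is an assumed example value.
INSERT INTO FILES(
    "path" = "s3://example-bucket/export/",
    "format" = "parquet",
    "parquet.version" = "2.4",
    "aws.s3.access_key" = "<access_key>",
    "aws.s3.secret_key" = "<secret_key>",
    "aws.s3.region" = "us-west-2"
)
SELECT * FROM example_db.orders;
```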
### Bug Fixes
The following issues have been fixed:
• Loading jobs failed due to overly coarse lock granularity in `TableMetricsManager`. #58911
• Case sensitivity issue in column names when loading Parquet data via `FILES()`. #61059
• Cache did not take effect after upgrading a shared-data cluster from v3.3 to v3.4 or later. #60973
• A division-by-zero error occurred when the partition ID was null, causing a BE crash. #60842
• Broker Load jobs failed during BE scaling. #60224
### Behavior Changes
• The `keyword` column in the `information_schema.keywords` view has been renamed to `word` to align with the MySQL definition. #60863
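A short sketch of a query against the view after the rename. Any existing query that selects the old column name would need the same adjustment:

```sql
-- The column is now `word`; the old name `keyword` no longer applies.
SELECT word FROM information_schema.keywords LIMIT 10;
```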
StarRocks/starrocks
GitHub
08/11/2025, 7:30 AM
`BALANCE` type to cluster balance results. #61081
• Optimized materialized view rewrite for external tables. #61037
• Default value of system variable `enable_materialized_view_agg_pushdown_rewrite` is changed to `true`, enabling aggregation pushdown for materialized view queries by default. #60976
• Optimized partition statistics lock competition. #61041
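If the new default for `enable_materialized_view_agg_pushdown_rewrite` is not wanted, a session-level override could look like the following sketch:

```sql
-- Turn aggregation pushdown rewrite off for the current session only.
SET enable_materialized_view_agg_pushdown_rewrite = false;

-- Confirm the value now in effect.
SHOW VARIABLES LIKE 'enable_materialized_view_agg_pushdown_rewrite';
```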
### Bug Fixes
The following issues have been fixed:
• Inconsistent Chunk column size after column pruning. #61271
• Synchronous execution of partition statistics loading may cause deadlocks. #61300
• Crash when `array_map` processes constant array columns. #61309
• Setting an auto-increment column to NULL results in the system mistakenly rejecting valid data within the same Chunk. #61255
• The actual number of JDBC connections may exceed the `jdbc_connection_pool_size` limit. #61038
• FQDN mode did not use IP addresses as cache map keys. #61203
• Array column cloning error during array comparison. #61036
• Deploying serialized thread pool blockage led to query performance degradation. #61150
• OK hbResponse not synchronized after heartbeat retry counter reset. #61249
• Incorrect result for the `hour_from_unixtime` function. #61206
• Conflicts between ALTER TABLE jobs and partition creation. #60890
• Cache does not take effect after upgrading from v3.3 to v3.4 or later. #60973
• Vector index metric `hit_count` is not set. #61102
• Stream Load transactions fail to find the coordinator node. #60154
• BE crashes when loading OOM partitions. #60778
• INSERT OVERWRITE failed on manually created partitions. #60750
• Partition creation failed when partition names matched case-insensitively but had different values. #60909
• The system does not support PostgreSQL UUID type. #61021
• Case sensitivity issue with column names when loading Parquet data via `FILES()`. #61059
StarRocks/starrocks
Beryl Chen
08/14/2025, 1:45 PM
Beryl Chen
08/15/2025, 6:40 PM
Ronit Kapoor
08/19/2025, 11:41 PM
GitHub
08/22/2025, 11:48 AM
`prepared_timeout` configuration to Stream Load Transaction Interface. #61539
• Upgraded StarOS to v3.5‑rc3. #61685
### Bug Fixes
The following issues have been fixed:
• Incorrect Dict version of random distribution tables. #61933
• Incorrect query context in context conditions. #61929
• Publish failures caused by synchronous Publish for shadow tablets during ALTER operations. #61887
• CVE‑2025‑55163 issue. #62041
• Memory leak in real-time data ingestion from Apache Kafka. #61698
• Incorrect count of rebuild files in the lake persistent index. #61859
• Statistics collection on generated expression columns causes cross-database query errors. #61829
• Query Cache misaligns in shared-nothing clusters, causing inconsistent results. #61783
• High memory usage in CatalogRecycleBin due to retaining deleted partition information. #61582
• SQL Server JDBC connections fail when the timeout exceeds 65,535 milliseconds. #61719
• Security Integration fails to encrypt passwords, exposing sensitive information. #60666
• `MIN()` and `MAX()` functions on Iceberg partition columns return NULL unexpectedly. #61858
• Other predicates of Join containing non‑push‑down subfields were incorrectly rewritten. #61868
• QueryContext cancellation can lead to a use‑after‑free situation. #61897
• CBO’s table pruning overlooks other predicates. #61881
• Partial Updates in `COLUMN_UPSERT_MODE` may overwrite auto-increment columns with zero. #61341
• JDBC TIME type conversion uses an incorrect timezone offset that leads to wrong time values. #61783
• `max_filter_ratio` was not being serialized in Routine Load jobs. #61755
• Precision loss in the `now(precision)` function in Stream Load. #61721
• Cancelling a query may result in a “query id not found” error. #61667
• LDAP authentication may miss PartialResultException, causing incomplete query results. #60667
• Paimon Timestamp timezone conversion issue when the query condition contains DATETIME. #60473
StarRocks/starrocks