Power BI Performance Tuning (2 of 3)

Optimizing Data Refresh

In Part 1, we established the foundation for any successful tuning engagement by understanding the solution's background and assembling our toolkit. With that critical context in hand, we now turn to the first of two primary optimization areas: semantic model refresh performance.

Why start with data refresh? Because improving refresh performance creates benefits that ripple through everything else you'll do. Many optimizations that speed up refresh -- like reducing the amount of data loaded -- also speed up the user interface. By loading less data, you create a smaller, more stable data model that opens faster, saves faster, and is easier to work with. Faster refresh cycles during development also mean you can iterate on UI improvements more quickly. Ultimately, this approach forces you to truly understand your data model before jumping into the front end, laying a critical foundation for every subsequent optimization.

Establishing Your Baseline Metrics

Before making any changes, you must establish baseline metrics for your solution's performance. Without these, you are just guessing whether anything actually helped. You'll capture these metrics before making changes and after each optimization iteration.

Data Model Size in Memory

Launch DAX Studio from the External Tools ribbon in Power BI Desktop. Navigate to the Advanced tab, click View Metrics, and select OK. On the VertiPaq Analyzer tab, note the total memory footprint. You can also export these metrics to a VPAX file to document the table and column sizes, which will be essential for validating your changes.

Refresh Execution Time

Launch SQL Server Profiler (available with SQL Server Management Studio) or access it from DAX Studio. Start a new trace and configure it to capture the Job Graph and Progress Report End events. Initiate a refresh of your semantic model. The profiler will record the start and end times for each operation. This data is invaluable for understanding how individual queries contribute to the overall refresh time and for prioritizing which tables to focus on.

The Optimization Checklist

With your baseline established, you can begin the systematic process of optimization. This checklist follows the logical flow of data—from source to the final tables—and provides a methodical approach to identifying and fixing bottlenecks.

Data Source and Connectivity

Optimize Network Location: Measure network latency between your data source, any required data gateway, and the Fabric region. The goal is to minimize latency.

Move to an Optimized Data Source Format: If you're pulling from flat files or web sources, consider migrating to a database, which enables faster filtering and aggregation.

Use Native Connectors: Where possible, replace generic ODBC connectors with native Power BI connectors, as they often enable query folding and reduce overhead.
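
As a rough sketch (the server, database, and DSN names are placeholders), swapping a generic ODBC connection for the native SQL Server connector in Power Query might look like this:

    // Before: generic ODBC connection; query folding depends entirely on the driver
    // Source = Odbc.DataSource("dsn=SalesDW")

    // After: native SQL Server connector, which folds filters, column selection, and joins
    let
        Source = Sql.Database("sql-prod-01.contoso.com", "SalesDW"),
        FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data]
    in
        FactSales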

Review Gateway Configuration: Monitor the performance of your data gateway during refresh windows. Bottlenecks with CPU, RAM, or network may require increasing resources or adding nodes to a gateway cluster.

Capacity Constraints

Monitor Capacity Utilization: Use the Fabric Capacity Metrics app to track Capacity Unit (CU) consumption during your refresh windows. If your refreshes are being throttled or are queuing, it's a sign that you're hitting capacity limits.

Schedule Refreshes Wisely: Schedule your refreshes during low-usage periods to avoid resource contention.

Approach & Model Structure

Evaluate Model Split: For very large or complex models, consider splitting them into smaller, more focused models by business domain.

Optimize Storage Mode: While a pure import model is the most common choice, for extremely large tables consider DirectQuery, Direct Lake, or a composite model with aggregations to reduce the amount of data that needs to be refreshed.

Disable Auto Date/Time Tables: Check if this setting is enabled in Power BI Desktop. These hidden tables are automatically created for every date column and can significantly increase your model size.

Tables and Columns

Remove Unused Columns: Use a tool like Measure Killer to scan your model and remove any columns not used in visuals, measures, or relationships. This can dramatically reduce the model's size in memory.

Reduce Column Footprint: Look for columns that consume a lot of storage or have a high number of distinct values. Examples include using integer identity columns instead of GUIDs, or splitting a datetime column into separate date and time columns (or dropping the time portion if it isn't needed); the separate columns have far lower cardinality and compress much better, as in the sketch below.
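
A minimal Power Query sketch (the table and column names are illustrative) of splitting a high-cardinality datetime column:

    let
        Source = Sql.Database("sql-prod-01.contoso.com", "SalesDW"),
        FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        // Derive low-cardinality date and time columns from the high-cardinality datetime
        AddDate = Table.AddColumn(FactSales, "OrderDate", each DateTime.Date([OrderDateTime]), type date),
        AddTime = Table.AddColumn(AddDate, "OrderTime", each DateTime.Time([OrderDateTime]), type time),
        // Remove the original column so its distinct values never reach the model
        RemoveDateTime = Table.RemoveColumns(AddTime, {"OrderDateTime"})
    in
        RemoveDateTime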

Implement a Star Schema: Model your data as fact tables surrounded by related dimension tables. For most BI scenarios, this design pattern strikes the best balance between performance and flexibility.

Reduce Rows Per Table: Filter your fact tables to only the relevant date ranges or row types required for analysis. Summarize the table using only the columns you need.
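
A sketch of both ideas (the table and column names are illustrative), filtering the fact table and then summarizing it to the grain the reports need:

    let
        Source = Sql.Database("sql-prod-01.contoso.com", "SalesDW"),
        FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        // Keep only the date range the reports actually analyze
        RecentRows = Table.SelectRows(FactSales, each [OrderDate] >= #date(2022, 1, 1)),
        // Summarize to one row per date, product, and store
        Summarized = Table.Group(
            RecentRows,
            {"OrderDate", "ProductKey", "StoreKey"},
            {
                {"SalesAmount", each List.Sum([SalesAmount]), type number},
                {"OrderQuantity", each List.Sum([OrderQuantity]), Int64.Type}
            }
        )
    in
        Summarized

Both Table.SelectRows and Table.Group fold back to a SQL source, so the filtering and aggregation happen in the database rather than in the mashup engine.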

Transformations

Optimize Source Query Performance: If you are using a SQL data source, review the source queries' execution plans. Ensure your source tables have proper indexing.

Reorder Power Query Steps: This is the most crucial transformation optimization. Steps that enable query folding—pushing work to the data source—should always come first. Reorder your transformations so that filtering, column removal, and basic calculations occur before any steps that break folding.
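
For example (again with illustrative names), foldable filtering and column selection should run before a step like adding an index column, which typically breaks folding:

    let
        Source = Sql.Database("sql-prod-01.contoso.com", "SalesDW"),
        FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        // Foldable steps first: the filter and column selection become part of the source query
        Filtered = Table.SelectRows(FactSales, each [OrderDate] >= #date(2022, 1, 1)),
        Trimmed = Table.SelectColumns(Filtered, {"OrderDate", "ProductKey", "SalesAmount"}),
        // Folding-breaking step last, so it operates on the already reduced result
        Indexed = Table.AddIndexColumn(Trimmed, "RowIndex", 1, 1, Int64.Type)
    in
        Indexed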

Push Work Upstream: For complex logic that breaks query folding but could be handled in native SQL, consider creating SQL views or using a native SQL query in Power Query to have the source database perform the work.
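
One way to do this is Value.NativeQuery against a SQL Server source (the query text and names below are illustrative); the EnableFolding option lets subsequent steps keep folding on top of the native query:

    let
        Source = Sql.Database("sql-prod-01.contoso.com", "SalesDW"),
        // The database performs the heavy aggregation instead of the mashup engine
        Summary = Value.NativeQuery(
            Source,
            "SELECT OrderDate, ProductKey, SUM(SalesAmount) AS SalesAmount
             FROM dbo.FactSales
             WHERE OrderDate >= '2022-01-01'
             GROUP BY OrderDate, ProductKey",
            null,
            [EnableFolding = true]
        )
    in
        Summary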

Incremental Refresh

Enable Incremental Refresh: For large fact tables that are append-only or have a predictable update pattern, configure incremental refresh. This allows you to process only new or recently updated data, which can reduce refresh times from hours to minutes.
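
Incremental refresh relies on two datetime parameters named RangeStart and RangeEnd that Power BI substitutes for each partition. A minimal sketch of the required filter (table and column names are illustrative):

    let
        Source = Sql.Database("sql-prod-01.contoso.com", "SalesDW"),
        FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        // The RangeStart/RangeEnd filter folds to the source, so each partition loads only its own slice
        Partitioned = Table.SelectRows(
            FactSales,
            each [OrderDateTime] >= RangeStart and [OrderDateTime] < RangeEnd
        )
    in
        Partitioned

The inclusive/exclusive boundary (>= RangeStart, < RangeEnd) ensures rows on a partition boundary are not loaded twice.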

Validating Your Improvements

After implementing optimizations, it's essential to verify that your reports still function as expected. Compare key metrics, ensure all visuals and interactions work correctly, and confirm that your refresh success rates remain stable over several cycles. Document your results using the baseline metrics you established earlier. This proves your work had a tangible impact and creates a repeatable methodology for future optimization efforts.

Now that our data is loading efficiently, the final step is to ensure a fast, responsive experience for our end users. In Part 3, we'll shift our focus to tuning the report itself.