Column Statistics

Column Statistics is activated by clicking in a table, then selecting Data -> Column Stats.

We are presented with a panel on the right of the screen that displays some interesting information about our data.

We can immediately see how many times each item appears in the data by viewing the Count statistics.

We can also see a list of the most and least frequently occurring items in the data.

An interesting behavior occurs when an item is selected from the Frequency list, all matching items in the table are highlighted.

!!! BEWARE !!!

What is likely to be an undesirable behavior is the Column Stats inability to recognize header rows.

A method to educate Sheets as to the presence of a header row (or simply unwanted rows before the data) is to freeze the rows between the header row and the data.

The Frequency and Summary statistics will not refresh automatically, so you will need to close and reopen the Column Stats feature.

Unfortunately, the header row value remains in the column chart for Count and Distribution.  Perhaps this will be corrected in a future update.

Analyzing Other Columns

With the Column Stats feature activated, you can analyze the stats of other columns by either clicking in a column of the data or by selecting the < and > buttons at the top of the Column Stats panel.

When selecting a column of numbers, we can see the Distribution of those values.

Hovering your mouse pointer over a column reveals the count of items that qualify in the defined range.

Selecting a column containing numbers reveals additional statistically oriented aggregations.

NOTE:  Because the entire column in the worksheet is being analyzed, the Total Rows is likely to be inaccurate when compared with the actual number of used rows.

With any luck, updates to this feature will include the ability to define a specific range of cells instead of defaulting to the entire columns.

Cleanup Suggestions

Cleanup Suggestions assists with the cleanup of data before analysis.  This reduces the number of errors in the final report.

Smart Cleanup is activated by clicking in a table, then selecting Data -> Cleanup Suggestions.

This loads the Cleanup Suggestions panel to the right displaying any areas of the data that it believes may cause issues during the analysis phase.

Duplicate Rows

We can see that we have duplicate rows in the data.  If this is an issue, hovering over the entries in the list will reveal which items will be kept (green) and which will be discarded (red) if the REMOVE action is invoked.

Trim Whitespace

We also see if there are any extraneous spaces before, after, or in the middle of the data.

Selecting the green checkmark button will accept all suggested modifications while selecting Trim All on a per-column basis allows for more focused correction.

Inconsistent Data

When working with repetitive data, items that frequently occur but with minor variations in spelling will be flagged as Inconsistent Data.

By selecting the green checkmark, the suggested spelling will be applied to all flagged cells.

BEWARE: Not all instances of inconsistent data are discovered, even when a refresh is performed.  This is likely due to the relative infancy of the feature and will no doubt discover more inconsistencies as the feature matures.

Leila Gharani

I'm a 6x Microsoft MVP with over 15 years of experience implementing and professionals on Management Information Systems of different sizes and nature.

My background is Masters in Economics, Economist, Consultant, Oracle HFM Accounting Systems Expert, SAP BW Project Manager. My passion is teaching, experimenting and sharing. I am also addicted to learning and enjoy taking online courses on a variety of topics.