liv: ribbon diagram of a p53 monomer (p53)
[personal profile] liv
I have a spreadsheet with approx forty thousand rows. Around 6000 of them are irrelevant - they're mixed in with the rest, but are identifiable based on data in one of the columns. The data covers three years. The years are not recorded as proper dates, but as plain text saying things like 2015/16.

My task is that for each year, I need to count the unique values in one of the columns. This column contains only text, no blanks. It's not made out of meaningful English words, but serial numbers containing letters and digits. The values in each column are repeated anywhere from 1 to 60 times; I just want to know how many different serial numbers there are, not overall, but separately for each year.

Year Serial Number Flag
2015/16 AAA111 Relevant
2015/16AAA128 Irrelevant
2016/17 AAA111 Irrelevant
2016/17 AAB139 Relevant
2016/17 AAA111 Relevant


My own knowledge of Excel, supplemented by searching for things like Excel count unique values, isn't quite sufficient. Anything that involves doing this semi-manually (eg sorting the columns then counting by hand) is unfeasible over tens of thousands of rows. Anything automated needs to not make Excel choke with a large-ish spreadsheet.

I tried following the instructions in this official Microsoft article, and it's not quite working for my case. As soon as I try to filter a column by unique values only, it overrides the filter I started with for taking out the 6000 rows marked 'irrelevant' in a different column. Even worse than that, instead of copying the roughly 10,000 cells in the same column that I tried to select, it copies a whole chunk of the spreadsheet with several columns and I'm not sure exactly how it's related to the area I selected. The second method, count using functions, I don't understand well enough to try, and since the method I thought I understood behaved very unexpectedly, I don't want to start blindly pasting in a formula I really won't be able to debug.

Does anyone have any suggestions for how to approach this?

(The reason why I'm trying to wrangle this myself rather than delegating it to someone who has relevant expertise is, well, annoying work politics. But the fact remains that I need to do it.)

Soundbite

Miscellaneous. Eclectic. Random. Perhaps markedly literate, or at least suffering from the compulsion to read any text that presents itself, including cereal boxes.

Page Summary

Top topics

December 2025

S M T W T F S
 123456
78910111213
14151617181920
21222324252627
282930 31   

Expand Cut Tags

No cut tags

Subscription Filters