Are you losing brain cells (and money) in Excel cells?
Typical Background Situation
Excel and other similar applications have been used for many years and are very popular today. Excel is easy to use, it is handy, and you do not need to take training to start using it. It is one of the first applications that many businesses use for financial calculations, planning, reporting and even for keeping track of expenses and income.
Indeed, Excel has served a lot of organisations well since the 1980s. You have probably used it more than most of the applications, if you work with numbers. At the same time, do you ever feel that Excel does not satisfy your needs anymore and you are tired of it? Do you feel that you have too much stress struggling with Excel files? It doesn’t mean that Excel is not good, it probably means that you have outgrown it. As you and your business grow, your software tools must also grow, otherwise your growth may be hindered.
Can you think of the reasons why you were not satisfied with Excel or even annoyed by it? Let’s name some.
1) A lot of mistakes. Mistakes in data, in formulas, in macros. You forget to put a “$” sign in formula before a cell’s address, you copy this cell, and get wrong results. You insert a row and forget to change a range in the SUM function, and you get wrong results. You enter a number in a cell which is formatted as text, and you get wrong results. Can you relate to all of this? If you have had many mistakes in your spreadsheets, you are not alone. Raymond Panko from the University of Hawaii has conducted an interesting study on spreadsheet errors. According to his findings, the average spreadsheet cell error rate is 3.9%. It looks small, but spreadsheets often contain many dependent cells, and the probability of an error in a spreadsheet increases rapidly. For example, given such a small cell error rate the probability of an error in a spreadsheet with 100 cells in cascades will be 98%! These findings are consistent with other studies which found errors in about 94% of spreadsheets. These numbers are shocking. Would you trust reports based on spreadsheets knowing that the probability of them containing errors is more than 90%? Decisions made on wrong data can cost a lot of money (and nerve cells). For example, In 2013 JP Morgan’s $6 billion loss was partly because of an Excel’s copy-and-paste error.
2) Time-consuming. When your files get bigger you notice how slowly Excel opens and saves them, how slowly it calculates formulas, copies cells, refreshes filter results, inserts cells, etc. Sometimes you can even get an “Out of memory” error. This especially happens when you have a lot of observations and variables or when you need to aggregate data from different sources. What if you did not save your file and Excel “froze”? You lose information, time and you lose money since time is money. Too much time spent on data aggregation, error checking and updating the data in Excel. Are you familiar with situations when it takes hours (or even days) to get a report?
3) Not suitable for collaboration. Editing by multiple users, emailing Excel files back and forth can multiply errors and create many versions of files, and you get lost in versions sometimes not knowing where is the last version. You can use the wrong “last” version because there is already “last_last2!” version somewhere. And probably the Excel file will contain errors after several people have made changes to it. In 2017 errors on a pair of Clallam County Sheriff’s Office budget documents cost the county $494,157. The spreadsheets were emailed back and forth between offices, and because of some cutting and pasting, not all the formulas were pasted correctly. Yes, having no good tools for a collaborative work can cost you a lot of money.
4) Vulnerability to fraud, corruption and leak of information. This is perhaps the most damaging issue. Fraudulent manipulations in Excel files have already resulted in great losses. It is so easy to alter either formulas, values, or dependencies without being detected. It is so easy to delete a file or a sheet, email a wrong file or lose a flash drive with confidential information. Even passwords and sheet protection cannot solve this problem and avoid losses. In 2009 the Clallam county treasurer’s office has made a discovery of at least $617,467 in missing funds. The cashier who was accused of stealing the funds hid the rows in a spreadsheet to cover up theft. No doubt, the county’s officials lost a lot of brain cells after this discovery.
5) Out of control. If there is no unified platform to manage and analyze data and Excel is the primary tool for data storage, analytics and reporting, many people are actually involved in creating a solution that fits the needs of an organization. These people have different levels of technical skills and different preferences, spreadsheets are multiplying rapidly, and even if you have some skilled person to manage information flows, things can get out of control easily. You may end up with a lot of files with similar information though not matching one another, files with different formatting, files with different versions. To create a report in such a situation you need to ask: “What files contain the right information? Who has the last version? Was information updated lately? How to aggregate data stored in different Excel files and in different locations?” But managers do not want their financial people and analysts to spend time on aggregating data from different sources in Excel, but rather to focus on drawing conclusions from data.
6) Computational flaws and limited analytics. Do you think you always get the right results with Excel? Do you think you can rely on data analysis tools built in Excel? Not quite. Are you surprised? There are so many flaws in Excel that can result in wrong decisions based on wrong results. Many computational flaws have been documented over the years (see for example, Jeffrey Simonoff’s “Statistical analysis using Microsoft Excel” article which has many other references and Eva Goldwater’s “Using Excel for Statistical Data Analysis – Caveats” article). Some of them are fixed, but some of them are still there. There is even an organisation which is devoted to providing information on Spreadsheet Risk Management, and they run annual conferences on this topic!
Let’s see some examples. Try to do this in Excel:
- Enter number 10 in the A1 cell. Enter the following formulas in the B1 and C1 cells respectively: “=-A1^2+100” and “=100-A1^2”. Do you think you will get the same results? No! The first formula gives you 200, while the second one gives you 0. Quite a difference, isn’t it?
- In the A1:A10 range enter numbers, but leave couple cells blank (these will be our missing values). In the B1:B10 cells make a reference to the cells in the A column. You are supposed to get the same numbers as in the A column except for missing values – they are converted to 0. Now let’s calculate an average in each column – enter the following formulas in the A11 and B11 cells: “=AVERAGE(A1:A10)” and “=AVERAGE(B1:B10)”. Do you expect to get the same results since you have just referenced your data in another column? No! Working with missing values in Excel can be misleading.
- Let’s try some data analysis. Enter some numbers in the A1:Z50 range. Let column A represent Y, and columns B through Z represent X. Let’s run a simple linear regression. Choose Data Analysis Tools – Regression. Enter Y range and X range and click “Ok”. Excel says that “Input range X cannot contain more than 16 variables (columns)”. Are you disappointed? Did you want to use more than 16 variables? Well, Excel is not a very good tool for data analysis.
- Try to enter SEP7, SEPT2 or MARCH1 in a cell. Excel converts them to dates. But what if these are gene names, and you did not expect them to be converted into dates! Gene name errors are widespread in the scientific literature. One study reveals that approximately one-fifth of papers with supplementary Excel gene lists in leading genomics journals contain erroneous gene name conversions.
7) High dependence on IT people for complex tasks. Yes, spreadsheets are intuitive and easy to use, but only for simple tasks within one small document. But what if you need to create complex formulas, dependencies or reports? It requires special skills. If you, being a business leader, want to receive a report in Excel, you need to know how to use formulas, pivot tables and even how to code. Or you can ask an IT person to do a report. But what if this person is on vacation or several different managers need different reports simultaneously? Will you wait? It does not seem efficient, and again you lose money and get stressed out.
So, do you think that you have reached a point when you have much stress because of Excel? You probably will never get rid of Excel completely in your organisation (though some try to stop using it and cut it out of their processes like Adobe’s Financial Chief, but maybe it is time to make a small step forward which may become a giant leap for your organization as you manage your data better, faster, safer and have easier access to analytics with other tools like Power BI, TIBCO’s Spotfire, ElasticSearch or even SAS. Are you ready for growth? Do you have the tools that will manage your growing data and satisfy growing demands? Are you ready to excel going beyond Excel?