Should researchers conduct an exploratory data analysis?

Before we consider the arguments on either side of this question, we must understand what an exploratory data analysis is. Then we can discuss whether researchers should conduct such an analysis.

An exploratory data analysis (EDA) is a way of analysing data sets in which the main characteristics are summarised in an easy-to-understand form, often using graphs.

There are four main objectives of the EDA:

  • to suggest hypotheses about the causes of observed phenomena.
  • to support the selection of appropriate statistical tools and techniques.
  • to provide a basis for further data collection through surveys or experiments.
  • to assess assumptions on which statistical inference will be based.

What’s more, EDA uses a variety of graphical and quantitative techniques, including histograms, multi-vari charts, scatter plots, ordination and rootograms. Seeing the shape of the data in this way helps researchers choose an appropriate statistical test for it.
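To make this a little more concrete, here is a minimal sketch of what a first pass at EDA might look like in Python, using only the standard library. The function names (`summarise`, `text_histogram`) and the scores are my own invention for illustration, not taken from any real study, and a real analysis would normally use a plotting library rather than a text histogram.

```python
import statistics


def summarise(values):
    """Return basic descriptive statistics for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
    }


def text_histogram(values, bin_width):
    """Count how many values fall in each bin; keys are bin start values."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical scores, invented purely for illustration
    scores = [12, 15, 15, 18, 21, 22, 22, 23, 27, 41]
    print(summarise(scores))
    for start, count in text_histogram(scores, 10).items():
        print(f"{start:>3}-{start + 9:<3} {'#' * count}")
```

Even this crude picture shows the kind of thing EDA is for: most of the scores cluster in two adjacent bins, with one value sitting well away from the rest, which is worth knowing before picking a test.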

At first glance, the EDA appears to be a well-structured and useful form of analysis. And in essence it is. Its objectives follow the scientific method, and various statistical techniques can be brought in to help carry out the analysis. It also has the added bonus of being able to take a complex set of figures and simplify the data to make it comprehensible. It is no surprise that it has become popular with many researchers.

The EDA lends itself very well to researchers working with quantitative data sets. However, for qualitative research, this approach is not advantageous. The most obvious reason is that there are few qualitative procedures that work efficiently with EDA. This would suggest that the EDA, although well structured and useful, is only beneficial for researchers working with quantitative data sets.

To conclude, even though the EDA is of little use to researchers working with qualitative data sets, it offers many advantages to those whose data it does suit. It is my opinion that researchers working with quantitative data sets should conduct an exploratory data analysis.


4 responses to this post.

  1. Posted by psuc9f on February 8, 2012 at 8:02 pm

    I have enjoyed reading this blog; however, it would have been improved if you had explained further why EDA does not work well with qualitative methods and which methods it does work with. For example, certain styles of questionnaire used in field studies today use a Likert scale to measure emotion before asking the participant to expand on their answer. Such scales can be statistically analysed and presented in a range of graphs to show their relevance to the hypothesis.


  2. I find myself somewhat agreeing with your blog in that researchers should indeed carry out EDA, just to see what kind of figures they get. However, I see no point in publishing or even mentioning the data they get unless it happens to be particularly interesting or unless that was the originally intended method of analysis. Abt (1987)* suggested that EDA should only be used when the researcher is trying to establish a hypothesis for future research. Abt went on to say that there is too large a gap between the uses of EDA and comparative data analysis (t tests, ANOVAs, etc.) and suggested a third method of analysis, descriptive data analysis, which could be used to fill the gap. The link for this paper is at the end of my comment should this be an area of particular interest to you.

    *http://www.schattauer.de/en/magazine/subject-areas/journals-a-z/methods/contents/archive/issue/1226/manuscript/14675/download.html


  3. I agree with you that EDA is a great tool for researchers, and is excellent for creating hypotheses to go on to test. It really helps researchers to understand their data and provides a deeper look into what the data really means. It is also useful for excluding outliers, and other such tasks. I agree that EDA is not really applicable to most qualitative data sets, and is mainly useful when analysing quantitative data sets, and I also believe that EDA should be conducted wherever possible.

