Data collection is the gathering of information from various sources, and data analysis is the processing of information to get useful insights from it. Furthermore, data analysis involves systematically employing statistical and/or logical methods to illustrate, describe, summarize, and evaluate data.
There are two types of data: primary and secondary.
Primary Data
Data collected directly from the selected population (e.g., respondents) using data collection tools such as an interview schedule, interview guide, questionnaire, observation, or focus group discussion is called primary data. It is also called first-hand information. Primary data did not exist before your study; it is the unique finding of your research. Primary data collection and analysis typically require more time and effort than secondary data research.
Primary data collection methods fall into two distinct groups: quantitative and qualitative.
Quantitative Data Collection Methods
Quantitative data collection relies on numerical measurement and mathematical calculation. Quantitative data collection and analysis involve techniques such as questionnaires and interviews with closed-ended questions, correlation and regression methods, and the calculation of measures like the mean, median, and mode.
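As a rough illustration of the measures named above, the following Python sketch computes the mean, median, and mode, plus a Pearson correlation coefficient. The survey values and the names hours, scores, and pearson_r are hypothetical and chosen only for this example; they do not come from any real study.

```python
import statistics

# Hypothetical answers to two closed-ended questions:
# hours studied per week and test score for five respondents.
hours = [2, 4, 4, 6, 9]
scores = [50, 60, 65, 70, 90]

mean_score = statistics.mean(scores)      # 67
median_score = statistics.median(scores)  # 65
mode_hours = statistics.mode(hours)       # 4 (the most frequent value)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (spread_x * spread_y) ** 0.5


r = pearson_r(hours, scores)
print(mean_score, median_score, mode_hours, round(r, 2))
```

A value of r close to 1 suggests a strong positive linear relationship between the two variables in this small made-up sample; it says nothing about causation.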
Qualitative Data Collection Methods
Collecting qualitative data, however, does not require numerical data or mathematical calculation. Qualitative research is closely linked to the subjective aspects of human experiences, such as words, feelings, sounds, emotions, colors, and other non-quantifiable elements. Some common methods for collecting and analyzing qualitative data are interview guides/in-depth interviews, focus group discussions, and observation.
Your selection between qualitative and quantitative methods of data collection depends on the area, nature, and objectives of your research.
Secondary Data
Data that has already been gathered and is available from other sources is called secondary data. It is available at a low cost and can be obtained more quickly than primary data. Secondary data is used when primary data is not available or cannot be obtained quickly.
Secondary data has already been published in various sources such as books, journals, newspapers, magazines, and online portals. These sources contain an abundance of data related to your research area, regardless of its nature. Hence, the careful selection of secondary data is crucial for strengthening the validity and reliability of research. The literature review provides a more comprehensive analysis of secondary data collection.
Secondary data collection methods offer a range of advantages, such as saving time, effort, and expense. However, they have a major disadvantage: secondary research does not expand the literature by producing new data.
Secondary Sources
The following list gives some idea of possible secondary sources, grouped into categories:
Organizational Publications
Numerous government and non-government organizations regularly gather data across various domains and make it accessible to the public and interest groups. Common examples include the census, birth and death registration, health reports, labor force surveys, demographic information, and economic forecasts.
Earlier Research
For some topics, a vast array of research studies that have already been done by others can provide you with the required information.
Personal Records
Some people write historical and personal records (e.g., diaries) that may provide the information you need.
Mass Media
Reports, blogs, and articles published in newspapers, in magazines, on the internet, and so on may be other good sources of data.
All qualitative, quantitative, and mixed-methods research studies can use secondary sources as a method of data collection. In qualitative research, researchers usually extract descriptive and narrative information (such as information from historical accounts of an event, descriptions of a situation, stories about beliefs and superstitions, or descriptions of a site). In quantitative studies, the information is usually extracted in numerical or categorical form.
Problems with Secondary Data Sources
You need to be careful when using data from secondary sources because the data may not be available or may not be in the right form or of the right quality. These problems vary from source to source. When you use this kind of data, here are some aspects you should keep in mind:
Validity and Reliability
The validity of information may vary significantly from source to source. For example, information obtained from a census is likely to be more valid and reliable than that obtained from most personal diaries. Validity and reliability are very important concerns in research, and they cannot be taken for granted. Some secondary sources are as reliable as primary sources; the census is one example, as it covers the whole population.
Other sources might not be as reliable, and they should only be used when no other data is available. Valid means that the data represents original and true findings and has been collected using scientific methods. When using secondary sources of information, examine them carefully to ensure that the content is genuine and authentic.
Personal Bias
In secondary sources, the chances of bias are higher than in primary sources. Some secondary sources, like personal records, can be highly biased, while others may not be. Personal diaries and other records, such as newspapers and mass media products, can be biased.
Newspapers, magazines, and websites often do not use rigorous and well-controlled methods of documentation. Most of the time, such writings are opinion-based rather than factual. In these publications, writers can distort the facts to make a situation look better or worse.
Availability of Data
Many new researchers assume that the necessary data will be readily accessible, but it is important to avoid making this assumption. Make sure that the required data is accessible before continuing with your study. Secondary sources are usually preferred in research because they are easy to obtain. If the data is hard to collect from secondary sources, the researcher should not rely on them.
Format
Before deciding to use data from secondary sources, it is equally important to ascertain that the data is available in the required format. For example, you might need to analyze age in the categories 23-33, 34-48, and so on, but in your source, age may be categorized as 21-24, 25-29, and so on.
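The age example can be checked mechanically. The sketch below is only an illustration: the function name can_regroup is hypothetical, and it covers just the two categories listed for each scheme. It tests one necessary condition: source categories can be merged into the required ones only if no source category straddles a boundary of the required scheme.

```python
def can_regroup(source_bins, target_bins):
    """Return True if every source bin lies entirely inside one target bin.

    Bins are (low, high) pairs with both ends inclusive.
    """
    return all(
        any(t_low <= s_low and s_high <= t_high for t_low, t_high in target_bins)
        for s_low, s_high in source_bins
    )


# Categories from the example: the source groups ages as 21-24, 25-29, ...
# while the analysis needs 23-33, 34-48, ...
source = [(21, 24), (25, 29)]
required = [(23, 33), (34, 48)]

compatible = can_regroup(source, required)
print(compatible)  # False: the 21-24 group crosses the lower edge of 23-33
```

Because the 21-24 group mixes ages below and above 23, its counts cannot be split between the required categories without access to the underlying individual records.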
Obsolete Data
Sometimes secondary sources are available for use in research, but they are very old. Old data is of little use in research. You generally cannot use a book that was written 20 years ago; the data in that book was valid and reliable at the time it was written, but given current circumstances, it is obsolete.
Libraries are full of books that contain data related to your research, but you have to check the date of publication to determine whether you can use a given book or not. In most cases, data up to 5 years old can be used in research. Only historical data can be used indefinitely because it represents history that cannot be researched in other ways.
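The rule of thumb above can be written as a small check. This is a minimal sketch: the function name is_usable, its parameters, and the example years are hypothetical, and the 5-year limit is simply the guideline stated in the text, not a universal standard.

```python
def is_usable(publication_year, current_year, max_age_years=5, historical=False):
    """Apply the age guideline: recent data passes, historical data always passes."""
    if historical:
        return True
    return current_year - publication_year <= max_age_years


# Hypothetical sources, evaluated as if the current year were 2025.
print(is_usable(2005, 2025))                   # False: 20 years old
print(is_usable(2021, 2025))                   # True: 4 years old
print(is_usable(1950, 2025, historical=True))  # True: historical record
```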