Explain the different data representation methods.
Answers
Explanation:
There are many different ways of representing longitudinal data structures.
Some depend on the nature of the data or on the nature of the analysis.
Others are largely equivalent.
Let's take pure panel information (discrete, evenly spaced state observations). It can be stored in three ways:
1. One file per wave, with identical structure, identified by PID.
2. One file, one record per respondent, with the wave identified by variable name/position.
3. One file, one record per respondent per wave, identified by PID and a wave-number index variable.
These are equivalent in their information content: we can move between them relatively easily (especially between types 2 and 3, in Stata and later versions of SPSS).
But they differ in their ease of use for different purposes.
For example, to cross-tabulate a variable for a given pair of waves, type 2 is clearly better.
However, if you want to cross-tabulate current status with last year's status, pooling across waves, type 3 is better.
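The trade-off between types 2 and 3 can be illustrated with a small sketch. It uses plain Python rather than Stata or SPSS, and the variable names (`status_w1`, `pid`, `wave`) and the data are hypothetical, chosen only for illustration. It converts a type-2 (wide) file to a type-3 (long) file and then cross-tabulates the previous wave's status against the current status, pooling across waves.

```python
from collections import Counter

# Hypothetical type-2 (wide) file: one record per respondent,
# with the wave identified by the variable-name suffix.
wide_file = [
    {"pid": 1, "status_w1": "employed", "status_w2": "employed", "status_w3": "unemployed"},
    {"pid": 2, "status_w1": "unemployed", "status_w2": "employed", "status_w3": "employed"},
]


def wide_to_long(records, stem, n_waves):
    """Return type-3 records: one per respondent per wave."""
    long_rows = []
    for rec in records:
        for wave in range(1, n_waves + 1):
            long_rows.append(
                {"pid": rec["pid"], "wave": wave, stem: rec[f"{stem}_w{wave}"]}
            )
    return long_rows


long_file = wide_to_long(wide_file, "status", 3)

# Index the long file by (pid, wave) so the previous wave is easy to look up.
by_key = {(r["pid"], r["wave"]): r["status"] for r in long_file}

# Cross-tabulate (previous status, current status), pooled over all waves.
lagged_crosstab = Counter()
for (pid, wave), status in by_key.items():
    previous = by_key.get((pid, wave - 1))
    if previous is not None:  # wave 1 has no previous wave
        lagged_crosstab[(previous, status)] += 1

for (previous, current), count in sorted(lagged_crosstab.items()):
    print(f"{previous:>10} -> {current:<10} {count}")
```

In the long layout the lag is a single lookup on the wave index, whereas in the wide layout each pair of adjacent waves would need to be tabulated separately and the tables added up.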
Status history data is relatively simple in principle: there is an observation for each time unit per person
An easy way to represent it is as a wide horizontal file: one variable per time unit
Broadly equivalent is a long vertical file: one record per person-time-unit
The practical complication is combining waves
If (as in ECHP design) the reference period is a calendar year, some respondents do not report their recent experience
If (as in BHPS) a variable-length reference period is used, there will be overlap between waves
With overlap, the analyst must decide which report to accept
In SPSS, handling wide 'calendars' by VECTOR/LOOP is straightforward
In Stata, handling long vertical files is easy
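The overlap problem when combining waves can be sketched as follows. This is a minimal illustration in plain Python, not a BHPS procedure: the data are invented, the function names are hypothetical, and the two preference rules ("later" or "earlier" report wins) are assumptions standing in for whatever rule the analyst actually chooses.

```python
# Hypothetical monthly status calendars from two waves whose reference
# periods overlap; keys are (year, month).
wave1 = {(1991, 9): "employed", (1991, 10): "employed", (1991, 11): "unemployed"}
wave2 = {(1991, 11): "employed", (1991, 12): "employed", (1992, 1): "employed"}


def conflicting_months(first, second):
    """Months reported in both waves where the two reports disagree."""
    return sorted(m for m in first.keys() & second.keys() if first[m] != second[m])


def combine_waves(calendars, prefer="later"):
    """Merge per-wave calendars (given in wave order) into one history.

    prefer="later" keeps the later wave's report for an overlapping month;
    prefer="earlier" keeps the earlier wave's report.
    """
    ordered = list(calendars) if prefer == "later" else list(reversed(calendars))
    combined = {}
    for calendar in ordered:
        combined.update(calendar)  # calendars applied last overwrite earlier ones
    return dict(sorted(combined.items()))


print(conflicting_months(wave1, wave2))
print(combine_waves([wave1, wave2], prefer="later"))
print(combine_waves([wave1, wave2], prefer="earlier"))
```

Listing the conflicting months before merging makes the analyst's decision explicit instead of burying it in the merge order.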
Event history data is a little more complicated
An efficient representation is to record the dates and destinations of all transitions: this is a pure event history (the act of observation must be recorded as an event)
Closely related is spell or episode history: store start of spell, state and end-date (including 'on-going at time of observation' or 'censored')
However, for many purposes event/episode data can be transformed into state histories, with a variable per time unit
This can be wasteful, if the average spell length is much greater than one time unit: long strings of the same data
A bigger problem is that it loses information: for instance two successive jobs with the same characteristics look like one long job.
It's also harder to think in spell terms (how long, when did this spell end/start)
But if you need to relate status in many domains, it's very convenient (e.g., you want to know job status and marital status at a particular time)
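The transformation from spell data to a state history, and the information it loses, can be shown with a short sketch. It assumes a hypothetical record layout (month indices, end exclusive, `end=None` for a spell on-going at observation) and invented data. Expanding the spells gives one state per time unit; collapsing the result back into runs merges two successive jobs with the same characteristics into one.

```python
from itertools import groupby

# Hypothetical spell records for one person. start/end are month indices
# (end exclusive); end=None marks a spell on-going at observation (censored).
spells = [
    {"state": "clerk", "start": 0, "end": 3},
    {"state": "clerk", "start": 3, "end": 5},  # a second job, same characteristics
    {"state": "unemployed", "start": 5, "end": None},
]


def spells_to_states(spell_list, observed_until):
    """Expand spells into a state history: one entry per time unit."""
    states = [None] * observed_until
    for spell in spell_list:
        end = observed_until if spell["end"] is None else spell["end"]
        for t in range(spell["start"], min(end, observed_until)):
            states[t] = spell["state"]
    return states


def states_to_spells(states):
    """Collapse a state history back into spells by grouping equal runs."""
    recovered, t = [], 0
    for state, run in groupby(states):
        length = len(list(run))
        recovered.append({"state": state, "start": t, "end": t + length})
        t += length
    return recovered


history = spells_to_states(spells, observed_until=8)
recovered = states_to_spells(history)

print(history)
print(len(spells), "original spells;", len(recovered), "recovered spells")
```

The state history stores "clerk" five times and "unemployed" three times, which also shows the wastefulness when spells are long relative to the time unit. The round trip also drops the censoring marker on the last spell, since a state history alone does not say whether the final spell had ended.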
Data and instructions cannot be entered into computers and processed directly in human language. Any type of data, be it numbers, letters, special symbols, sound or pictures, must first be converted into machine-readable form, i.e., binary form. For this reason, it is important to understand how a computer, together with its peripheral devices, handles data in its electronic circuits, on magnetic media and in optical devices.
Data representation in digital circuits
Electronic components, such as microprocessors, are made up of millions of electronic circuits. The presence of a high voltage (on) in these circuits is interpreted as ‘1’ while a low voltage (off) is interpreted as ‘0’. This concept can be compared to switching an electric circuit on and off. When the switch is closed, the high voltage in the circuit causes the bulb to light (‘1’ state). On the other hand, when the switch is open, the bulb goes off (‘0’ state). This forms the basis for describing data representation in digital computers using the binary number system.
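As a small illustration of the binary number system, the sketch below converts a number and a short piece of text into strings of 1s and 0s and back again. It is only an illustration in Python: the helper names `to_bits` and `from_bits` are hypothetical, and the text is assumed to be encoded as ASCII, one byte (8 bits) per character.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as 8 binary digits, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


def from_bits(bits: str) -> bytes:
    """Rebuild the bytes from space-separated groups of binary digits."""
    return bytes(int(group, 2) for group in bits.split())


# A number: 13 = 8 + 4 + 1
number_bits = format(13, "08b")

# Text: each character is first mapped to a numeric code (ASCII assumed),
# and that code is what is stored as a pattern of 1s and 0s.
text_bits = to_bits("Hi".encode("ascii"))

print(number_bits)                            # 00001101
print(text_bits)                              # 01001000 01101001
print(from_bits(text_bits).decode("ascii"))   # Hi
```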
Data representation on magnetic media
The presence of a magnetic field in one direction on magnetic media is interpreted as ‘1’, while a field in the opposite direction is interpreted as ‘0’. Magnetic technology is mostly used on storage devices that are coated with special magnetic materials such as iron oxide. Data is written on the media by arranging the magnetic dipoles of some iron oxide particles to face in one direction and others in the opposite direction.
Data representation on optical media
In optical devices, the presence of light is interpreted as ‘1’ while its absence is interpreted as ‘0’. Optical devices use this technology to read or store data. Take the example of a CD-ROM: if the shiny surface is placed under a powerful microscope, it is seen to have very tiny holes called pits. The areas that do not have pits are called land. A laser beam reflected from the land is interpreted as ‘1’. A laser beam entering a pit is not reflected, and this is interpreted as ‘0’. The pattern of light reflected from the rotating disk falls on a photoelectric detector that transforms the pattern into digital form.