Unit VI discussion board - Operations Management
Please make sure that the work is your own and not copied and pasted from someone else's article or work. Please watch for spelling and grammar errors. Please read the study guide. Please use the APA 7th edition format.
Book Reference: Gray, D. E. (2020). Doing research in the business world (2nd ed.). SAGE. https://online.vitalsource.com/#/books/9781529700527
Pre-test/post-test quantitative designs have been the subject of criticism among methodologists. What practical steps do you think a researcher can take to address the limitations of such a design if it is the only one available?
RCH 7301, Critical Thinking for Doctoral Learners 1
Course Learning Outcomes for Unit VI
Upon completion of this unit, students should be able to:
4. Assess theoretical research methodologies in contemporary business scholarship.
4.1 Discuss a population and sampling frame for a given scenario.
4.2 Justify the use of a selected sample.
7. Implement a critical thinking process for business research methodology.
7.1 Describe a valid and reliable research instrument.
7.2 Compose an appropriate research design for a study.
8. Compose scholarly business research writing.
8.1 Compose a response to issues and questions surrounding quantitative research methods.
Course/Unit Learning Outcomes | Learning Activity
4.1 | Unit Lesson, Chapter 6, Chapter 24, Unit VI Assignment
4.2 | Unit Lesson, Chapter 6, Chapter 24, Unit VI Assignment
7.1 | Unit Lesson, Chapter 6, Chapter 24, Unit VI Assignment
7.2 | Unit Lesson, Chapter 6, Chapter 24, Unit VI Assignment
8.1 | Unit Lesson, Chapter 6, Chapter 24, Unit VI Assignment
Required Unit Resources
Chapter 6: Quantitative Research Design
Chapter 24: Analysing and Presenting Quantitative Data
UNIT VI STUDY GUIDE
Quantitative Research Design: Exploration
Unit Lesson
Quantitative Research Design
Quantitative research measures and defines elements through the collection of data, the analysis of data, and the application of the data to a theoretical framework. Quantitative research design can be categorized into four main types, which are listed below:
• descriptive, in which a subject is measured once; descriptive quantitative research establishes associations between variables;
• correlational, in which the relationship between study variables is investigated;
• quasi-experimental, in which a cause-and-effect relationship is determined without random assignment; and
• experimental, in which a subject is measured before and after a treatment and a cause-and-effect relationship is determined (Drummond & Murphy-Reyes, 2018).
The differences among the four types have to do with the amount of control that the researcher designs for
the variables in the experiment or study. Quantitative research makes use of tools (e.g., graphs, linear
regressions, hypothesis testing) to organize and analyze the gathered data.
Researchers gather data from quantitative studies via experimentation (i.e., where an independent variable’s
effects on a dependent variable are measured) or through surveys, which are designed along a rating scale.
Because the research questions in a quantitative study are narrowly focused, the study can be very narrow and limited in scope. That is both a strength and a weakness. A quantitative study on a very focused sample can yield reliable data about that group and research question, and the study can be replicated elsewhere to test a theory or hypothesis again. However, the narrow data collection and focused sample can also mean that the study's results or conclusions are not applicable to a wider area or grouping of people and, therefore, can have limited use unless the study is replicated repeatedly to support the findings. Data trustworthiness is determined by the credibility of the data collection and by the data's transferability, dependability, and confirmability.
Descriptive Quantitative Research
A researcher who designs a descriptive study wants to describe things as they are.
Descriptive quantitative research either identifies the characteristics of a phenomenon or explores correlations
among phenomena. In terms of survey research, which is the most commonly deployed type of descriptive
research, the researcher seeks to describe the characteristics of a larger population. Descriptive research
examines phenomena as they are and does not involve changing a situation that is being investigated. Since
the researcher does not practice control over any variables in the study design, descriptive research cannot
be used to determine cause-and-effect relationships.
A descriptive research study might employ data collection strategies such as sampling, observing, or
interviewing, which take on specific forms when the researcher wants them to yield quantitative data.
Descriptive research designs include observation studies, correlational research, development studies, and
survey-based research (Oakshott, 2019). All of these designs yield data that can be analyzed statistically. Among these designs, survey-based research is the most commonly used type of descriptive quantitative research.
Correlational Quantitative Research
According to Creswell and Creswell (2018), a correlational study can examine the extent to which differences
in one characteristic or variable are related to differences in one or more other characteristics or variables. A positive correlation (coefficient toward +1.0) exists if the dependent variable increases in a predictable fashion as the independent variable increases; a negative correlation (coefficient toward -1.0) exists if it decreases predictably. Correlational research seeks to establish a
relationship between variables that do not readily lend themselves to experimental manipulation or control.
RCH 7301, Critical Thinking for Doctoral Learners 3
UNIT x STUDY GUIDE
Title
In a simple correlational study, a researcher gathers data about two or more characteristics of a study
population. The numbers that are used reflect measurements of the characteristics, such as customer
satisfaction ratings between two locations, employee satisfaction ratings with and without a type of employer-
provided service, and so on. In a correlational study, each case contributes a pair of measurements, and these pairs are used to calculate the correlation coefficient (r). A perfect correlation is +1.0 or -1.0. If the characteristics are not related or are only remotely related, the coefficient is closer to 0.
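As a minimal illustration, the calculation behind r can be sketched in Python. The data below are hypothetical ratings invented for the example, not drawn from any real study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired measurements."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, computed from deviations about the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical paired satisfaction ratings for five cases
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
r = pearson_r(x, y)  # positive r: y tends to rise with x
```

A coefficient near +1.0 or -1.0 indicates a strong linear relationship; a coefficient near 0 indicates little or no linear relationship.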
While a correlational relationship can be measured, it does not imply a cause-and-effect relationship.
Researchers must be careful to avoid claiming causality, even if a correlation close to +1.0 or -1.0 is found.
Influence can be present among correlating characteristics, but researchers cannot infer a cause-and-effect
relationship based on correlation alone. Consider the following example: The Earth’s atmospheric
temperature has demonstrably risen since pirates in tall ships stopped sailing the high seas, but the absence
of pirates did not cause the rise in temperatures—even though the correlation is close, if not perfect.
Correlational research can describe the homogeneity or heterogeneity of the variables; it can describe the degree to which the variables are intercorrelated by computing the correlation coefficient r.
Quasi-Experimental and Experimental Quantitative Research
Experimental and quasi-experimental research is used to test a hypothesis and, further, the effects of an intervention. An intervention is the main factor in experimental research. To measure the effects of an intervention, the researcher has to identify the variables and discern the comparisons that are going to be made between or within the group(s). Researchers must make comparisons to examine relationships between dependent and independent variables.
Experimental designs have an intervention, a control group, and randomization of participants in the study’s
groups. A quasi-experimental design has an intervention, but it has no randomization of participants in the
experimental and control groups.
                              | Experimental Design | Quasi-Experimental Design
Intervention                  | X                   | X
Control Group                 | X                   |
Randomization of Participants | X                   |
Many experimental research designs measure a dependent variable before and after an intervention, with before and after measurements being the minimum. In a pre-test/post-test design, data are collected at only those two points, so such a design can work for a project such as a dissertation study.
A good experimental or quasi-experimental quantitative research design can aid you in answering the study’s
research question at the same time the design reduces threats to the design’s validity. As a researcher,
asking and answering the following eight questions can help to address key features of an experimental or
quasi-experimental research design.
• What is the research question, and will the study entail an intervention?
• Rather than staging an intervention, will the researcher observe participants and take
measurements?
• What are the variables?
• When and how often will the researcher collect data or take measurements?
• What is the setting for the study?
• If the intervention study has multiple groups, how will the researcher randomly assign participants to
the groups?
RCH 7301, Critical Thinking for Doctoral Learners 4
UNIT x STUDY GUIDE
Title
• If the study involves humans and an intervention, how will the researcher, participants, and anyone else involved in administering the study be blinded from knowing the groups to which participants were assigned?
• What controls will be put into place to reduce the influence of variables that are not involved in the
study?
Experimental research designs contain an intervention, so they seek to answer questions about differences
(e.g., the difference between an outcome that is measured in both the experimental and the control group).
On the other hand, correlational studies look at associations.
An experimental study is valid only if the following characteristics are present:
• an intervention, where the researcher manipulates the independent variable;
• control for the influence of variables not being measured in the study (e.g., through randomization and control groups); and
• randomization, where the researcher randomly assigns each participant so that a participant has a
50/50 chance of being assigned to either the intervention or the control group. Randomization is
important to deducing the result of the intervention at the end of the experiment.
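The randomization step described above can be sketched as a short routine. The participant IDs and the seed below are hypothetical; this balanced-alternation scheme is one simple approach, not the only valid one.

```python
import random

def randomize(participants, seed=None):
    """Shuffle the participants, then alternate assignment so that each person
    has an equal chance of landing in either group and group sizes stay balanced."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {"intervention": [], "control": []}
    for i, person in enumerate(shuffled):
        key = "intervention" if i % 2 == 0 else "control"
        groups[key].append(person)
    return groups

# Hypothetical participant IDs; a fixed seed makes the assignment reproducible
assignment = randomize(range(20), seed=42)
```

Recording the seed (or the assignment list itself) lets the researcher document exactly how participants were allocated.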
Below, study two tables that present information about statistics that examine differences and associations
between and among variables.
Name | Test statistic | Purpose | Number of groups
Independent samples t-test | t | Test the difference between the means of 2 independent groups. | 2
Paired samples t-test | t | Test the difference between the means of 2 paired groups (before and after measurements are typical paired samples). | 2
One-way analysis of variance (ANOVA) | F | Test the difference among means of >2 independent groups for one independent variable (that has >1 level). | >2
Two-way analysis of variance (ANOVA) | F | Test the difference among means for 2 independent variables, where each can have >1 level. | >2

Table 1. Quantitative research design: Statistics that examine differences using an interval/ratio measurement level
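To make the first row of Table 1 concrete, the pooled-variance independent samples t statistic can be computed by hand. The group scores below are hypothetical; a full analysis would also compare t against a critical value or compute a p-value.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(a, b):
    """Student's t statistic for the difference between the means of two
    independent groups (pooled variance; assumes roughly equal variances)."""
    na, nb = len(a), len(b)
    # Pooled sample variance, weighted by each group's degrees of freedom
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))

# Hypothetical outcome scores for an intervention group and a control group
t = independent_t([5, 6, 7], [1, 2, 3])
```

A t of zero means the group means are identical; larger absolute values indicate a larger difference relative to the within-group variability.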
Name | Test statistic | Purpose | Measurement of dependent variable
Pearson product-moment correlation | r | Measure the strength and direction of the relationship between 2 variables. | Interval/ratio
Spearman rank-order correlation | ρ | Measure the strength and direction of the relationship between 2 variables (nonparametric). | Ordinal, interval, or ratio
Linear regression | | Predict the value of a dependent variable, and measure the size of the effect of the independent variable on a dependent variable while controlling for covariates. | Interval/ratio
Logistic regression | | The same as linear regression, but used when the dependent variable is binary. | Binary/dichotomous

Table 2. Quantitative research design: Statistics that examine associations
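As a sketch of the simplest case in Table 2, an ordinary least-squares line for a single predictor can be fitted directly. The data are hypothetical; a design with covariates, as described in the linear regression row, would require multiple regression instead.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; the slope b estimates the
    size of the effect of the independent variable on the dependent variable."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# Hypothetical interval-level data that happen to lie on the line y = 1 + 2x
intercept, slope = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])
```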
Refer to these tables in conference with your mentor and dissertation chair to make decisions about
quantitative research designs.
References

Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). SAGE.

Drummond, K. E., & Murphy-Reyes, A. (2018). Nutrition research: Concepts and applications. Jones & Bartlett Learning.

Oakshott, L. (2019). Essential quantitative methods: For business, management and finance (7th ed.). Red Globe Press.
6 Quantitative Research Design
Chapter outline
· The structure of experimental research
· Experimental and quasi-experimental research design
· Generalizing from samples to populations
· Designing valid and reliable research instruments
Keywords
· Experimental research
· Research questions
· Hypotheses
· Dependent variables
· Independent variables
· Descriptive statistics
· Inferential statistics
· Experimental design
· Quasi-experimental design
· Sampling
· Validity
· Reliability
Chapter objectives
After reading this chapter you will be able to:
· Describe the experimental and quasi-experimental research approaches.
· Formulate appropriate questions and hypotheses.
· Identify populations and samples.
· Describe the principles of research tool design.
A research design is the overarching plan for the collection, measurement and analysis of data. Typically, a research design will describe the purpose of the study and the kinds of questions being addressed, the techniques to be used for collecting data, approaches to selecting samples and how the data are going to be analysed.
We saw in Chapter 2 that experimental research methodology usually involves truth-seeking (as opposed to perspective- or opinion-seeking) and may often involve the use of quantitative methods for analysis. It tends, therefore, to utilize a deductive approach to research design, that is, the use of a priori questions or hypotheses that the research will test. These often flow from sets of issues and questions arising from the researcher's engagement with a relevant body of literature, such as marketing, knowledge management or supply chain logistics. The intention of experimental research is the production of results that are objective, valid and replicable (by the original researcher, or by others). In terms of epistemology, then, experimental research falls firmly into the objectivist camp, and is influenced by positivistic theoretical perspectives. It takes, for example, some of the principles of research design (such as the use of experimental and control groups) from the natural sciences. However, given the discredited status of positivism, advocates of the experimental approach are now likely to make more cautious and modest claims for the veracity and status of their research results.
In an organizational context, research might stem not from issues prompted by a body of literature, but from a real, live problem the researcher is asked to solve. The initial focus, then, is the problem itself (rising absenteeism, communication bottlenecks, data security, etc.), but the researcher will probably soon have to access both the academic literature (including technical and institutional sources) and also grey literature such as internal organizational documents and reports.
Chapter 3 showed how the researcher journeys through a process of refinement, whereby the territory covered by the research literature becomes increasingly focused. But this is not just a question of narrowing the research. The core issues that emerge from the literature gradually build into significant sets of themes, or concerns, that link to, and help to specify, the research questions and the research design for solving them.
Note that many of the issues discussed in this chapter (for example, the generation of research questions, the identification of samples from populations and issues of validity and reliability) are also discussed in many of the chapters that follow – even those associated with more qualitative designs.
The structure of experimental research
The experimental research design process, put simply, comprises two steps: the planning stage and the operational stage (see Figure 6.1). At the planning stage, the main issue or research question may be posed and the relevant literature and theories investigated. From these it should be possible (if the issue is capable of being researched) to formulate research hypotheses. The dependent variables (the subject of the research) and independent variables (variables that affect the dependent variable) are identified and made explicit, after which we move into the operational stage. After the experiment has been conducted, the analysis stage may involve the use of both descriptive and inferential statistics (described in Chapter 24). From the analysis it then becomes possible to either accept or reject the hypothesis. A formal document or presentation is then prepared to report the results. Let us look at each of these stages in more detail.
Figure 6.1 Stages in the planning and operation of an experimental and quasi-experimental research project
Source: Adapted from Keppel, Saufley and Tokunaga (1992). Reprinted by kind permission of Macmillan
Identifying the issue or questions of interest
We saw in Chapter 3 that some of the criteria that make up a 'good' research topic include the availability of resources and access to sponsors and other people who may be able to help in the research. Sometimes a research issue may arise from your reading of a body of literature. In a workplace setting, issues or questions spring up as a result of real problems that require a solution, or as a result of a pilot study prior to the implementation of a research project.
Reviewing relevant literature and theories
As we saw in Chapter 2, the experimental approach to research is often deductive, so once an area or issue has been chosen for research, the next stage is to identify and delve into a relevant body of literature. Chapter 5 illustrated some of the sources where you might find the literature you need. Early on in your research, you should try to identify the theories that are relevant to addressing your topic, and also what kinds of research methods have been used to address the subject. The literature search will also identify who are the dominant and influential writers in the field. Having looked at the literature, you may decide that the scale of the subject is too large (particularly in terms of your own tight timescales), or that the investigation you were considering has already been done. However, you may also see that previous investigations have been flawed, or that there are gaps in the research that are worth filling. For example, you may become aware of emerging technology-based learning theories, but notice that there have been few studies of their application within the realm of social media (in which you have personal experience). This could be your niche, your experience in the area giving you a head start.
Developing questions and hypotheses
Research questions and hypotheses are merely the configuration of issues into a transparent and measurable formulation. The way in which research questions are stated, their focus and the kinds of data they seek are strongly connected to the philosophy and research paradigm of the researcher (recall Chapter 2). As Wield (2002) also cautions, writing questions and hypotheses is not necessarily a linear process. Even after they have been formulated, either further reading of the literature, or surprises at the piloting or data gathering stages, can force the researcher to amend or even rewrite them. Let us look at research questions and hypotheses in more detail.
Constructing research questions
The ways in which we formulate key questions can sometimes drive us down unfruitful paths, even when the underlying concerns that motivate our questions are genuine and important. It might help if we reflect for a moment on the genuine concerns that drive us to ask the questions we ask (Sarasvathy, 2004). As Alford (1998) points out, research questions are not the same as problems. Problems, themes and concerns may be allocated to you by a sponsor, or may emerge from your engagement with a relevant body of literature. Alford, however, asserts that, in contrast to a problem, a research question comprises two elements: firstly, a connection to a theoretical framework; secondly, a sentence in which every word counts and which ends (not surprisingly) with a question mark. Questions also describe potential relationships between and among variables that are to be tested. Blumberg et al. (2005) similarly distinguish between what they call dilemmas and research questions. A dilemma is a signal that all is not well – for example, falling sales, higher staff absenteeism or higher borrowing costs. The key is knowing how to turn statements of dilemmas into tight research questions. Table 6.1 offers some examples.
Table 6.1
It is clear from Table 6.1 that each dilemma is addressed by at least one question that explores the relationships between two variables. Kerlinger and Lee (2000) argue that a good research question:
· Expresses a relationship between variables (for example, company image and sales levels).
· Is stated in unambiguous terms in a question format.
But, as Black (2001) states, a question could meet both of Kerlinger and Lee’s criteria and still be invalid, because it may be virtually impossible to operationally define some of its variables. What, for example, do we mean by ‘digital technologies’ (in the above example), and how would we define them in ways that could be measured? As Hedrick et al. (1993) argue, researchers may have to receive sets of questions from research sponsors, and these may be posed by non-technical people in non-technical language. The researcher’s first step, then, is to re-phrase the questions into a form that is both researchable and acceptable to the client. Research questions can be classified into four major categories:
· Descriptive (‘What is happening?’, ‘Which methods are being used?’).
· Normative (‘What is happening compared to what should happen?’). The standards against which the outcomes are evaluated could include legal requirements, professional standards or programme objectives.
· Correlative (‘What is the relationship, and the strength of this relationship, between variable X and Y?’). Note that this establishes a relationship, but it does not imply a cause.
· Impact (‘What impact does a change in X have on Y?’). In contrast to correlation studies, impact questions do try to establish a causal relationship between variables.
Table 6.2 provides some examples of research questions for each of these categories.
It is often useful to take a research question and to break it down into subordinate questions. These are highly specific and assist in answering the question to which they are attached. Taking the first question in Table 6.2, we might devise a set of subordinate questions such as:
· How common is drug misuse among male and female employees?
· How does drug misuse compare across different departments?
· Has drug misuse increased or decreased over the past five years?
This is also a useful exercise because subordinate questions can provide a stage between the original objective and the kinds of detailed questions needed for research tools such as questionnaires and interview or observation schedules.
Case Study 6.1 provides an illustration of how research questions often have to be revised and refined before they become sufficiently focused and usable.
Table 6.2
Source: Adapted from Hedrick et al., 1993
Case Study 6.1 Getting those research questions right
A researcher, working for a voluntary association giving advice to the public, is concerned that most of those seeking the bureau’s help are white, with very few clients coming from the ethnic minority population. She receives a small grant from the bureau’s trustees to carry out a research project. She formulates her research questions as follows:
Research questions
1. To produce a report detailing the research. To check if the bureau is conforming to its organizational aims and objectives and if not how it can improve the delivery of services.
2. To increase awareness of the needs of ethnic minority clients and potential clients of the bureau among staff and to inform the organization of staff training needs.
3. To use this as a starting point for further work to be carried out by volunteers at the bureau.
Take a look at these research questions. What is wrong with them? Well, to be honest, quite a lot. Question 1 is not really a question but an output. This is what will be produced through the research. Questions 2 and 3 are aims or ambitions. What are listed as research questions do not deserve the description. They may result from the research but are not objectives, since there is nothing here that can be measured.
After some thought, the researcher arrives at the following list of questions.
1. What are the needs of ethnic minority groups in the district?
2. What access to information about the bureau do they have?
3. Do those that access the information implement its contents effectively?
4. Is there a relationship between the quality of information given, and ethnic minority trust in the bureau?
5. What degree of awareness should bureau staff have (in relation to their organizational service levels) about the needs of ethnic minority groups?
Activity 6.1
Examine the final set of questions in Case Study 6.1. Which of these research questions belongs to the descriptive, normative, correlative or impact categories?
Suggested answers are provided at the end of the chapter.
Research questions are formulated as part of many research studies, whether perspective-seeking or truth-seeking, although not necessarily at the same stage of the research. In perspective-seeking studies, for example, questions may emerge as part of the data gathering exercise. For truth-seeking research, including experimental and quasi-experimental research, they are usually formulated at the beginning of the research process. But while perspective-seeking research usually relies just on research questions, truth-seeking approaches usually go further and require the formulation of a hypothesis.
Employability Skill 6.1 Setting objectives that are achievable
Coming up with a sufficiently focused research question will help you to develop the key employability skill of setting achievable objectives. Your research question must have an achievable objective. Breaking your overall question down into sub-questions will help you to decide whether or not you are being too ambitious in your aims, and whether you should refine your question further.
Constructing hypotheses
Research questions are usually broad in nature, and may lend themselves to a number of answers, but a hypothesis is capable of being tested and is predictive. For example, ‘How is trust promoted in organizations?’ is a research question and not a hypothesis. To convert the question into a hypothesis we might conjecture that: ‘Emotionally intelligent leadership promotes trust’. Kerlinger and Lee (2000) suggest that a hypothesis is a speculative statement of the relation between two or more variables. Good hypotheses, then, should contain a statement containing two or more variables that are capable of measurement. Measurement, however, can only occur if the variables contained in the hypothesis can be operationally defined (see next section). Certainly, in the above hypothesis, the two variables ‘emotionally intelligent’ and ‘trust’ can each be operationally defined, compared through a research study, and the statement either accepted or rejected.
In formulating a hypothesis, care should be taken to avoid what Kerlinger and Lee (2000) describe as value questions, for example those that contain words such as ‘should’, ‘ought’ or ‘better than’. Similarly, the statement ‘The implementation of the new information technology system has led to poor results’ is also a value statement because of the use of the word ‘poor’ – what, exactly, is meant by this? A better approach would be to state the results in measurable terms such as ‘reduced output’, ‘lower staff satisfaction’, or ‘computer error’. It is useful to reflect that negative findings are sometimes just as important as positive ones since they can highlight new lines of investigation.
Activity 6.2
Examine each of the following statements and decide which (if any) make valid hypotheses.
1. Using external coaches leads to disappointing levels of employee commitment.
2. What are the major causes of intranet failure?
3. The introduction of a Six Sigma process will increase levels of customer satisfaction.
Suggested answers are provided at the end of the chapter.
Operationally defining variables
One of the problems in formulating research questions and hypotheses is that they tend to be somewhat generalized and vague. Before research tools can be drawn up, it is important to operationally define key variables so it is quite clear what is being measured. Kerlinger and Lee (2000) define an operational definition as something that gives meaning to a construct or a variable by setting out the activities or 'operations' that are necessary to measure it. Classifying operational definitions can sometimes be quite challenging. For example, our research question might be: What factors provide the key drivers for ensuring business success in the medium term? As it stands, the question is far too vague to provide a basis for measurement. Returning to the question, we need to operationally define what we mean by 'business success': is it output, profitability, cost control or perhaps a combination of all of these? Similarly, what is meant by 'medium term'? Is it one year, two years, ten years? Going through the process of producing operational definitions allows us the opportunity to rethink some of our assumptions and may even encourage us to rewrite our original research question or questions. Note the loops back to previous stages in Figure 6.1.
Identifying independent and dependent variables
Scientific research aims to identify why conditions or events occur. These causes are called independent variables and the resulting effects, dependent variables. A variable is a property that can take different values. Thus, the focus of research might be the introduction of a new performance-related pay system (independent variable) which is designed to lead to greater output (dependent variable). But as Black (2001) warns, relationships between variables may be ones of association, but this does not necessarily imply causality: that is, that changes in one variable lead to changes in another. For example, after the introduction of performance-related pay, output may rise, but this increase may have been caused by completely different factors (for example, better weather or victory by the local football team, each of which might boost morale and hence output).
Indeed, independent variables may act upon dependent variables only indirectly via intervening variables. Thus, someone may undertake high-calibre professional training hoping that this will eventually lead to a higher income level. But in practice, the professional training (independent variable) acts upon income level (dependent variable) via its effects on the person's job prospects (intervening variable, as illustrated in Figure 6.2). In addition to this, Figure 6.2 also shows other relationships. For example, it is conceivable that, having achieved a higher level of income, some people may then want to (and be able to afford) more professional training.
Watch: Variables in research design
In experiments, it is the independent variable that is manipulated to see the effect. So, using the above example of performance-related pay, we might introduce such a scheme into a company and observe the effect on output. But, as has already been suggested, there may be other factors at work that might influence such changes in output. These are termed
extraneous variables
and must be ‘controlled for’: that is, the study must be designed in such a way that the impact of extraneous variables does not enter the calculations.
Figure 6.2 Illustration of the relationship between dependent, independent and intervening variables
There are various ways of controlling for extraneous variables. One is through elimination. So, using our example of performance-related pay, if the study were concerned about the possible influence of current status or grade, we would only choose people from a certain grade for the study. Another way of controlling extraneous variables is through randomization. If randomization is achieved, then it is probable that the experimental groups are equal in terms of all variables. It should be noted, of course, that complete randomization is difficult to achieve in practice. Say, for example, that we know that male and female workers are exactly equally represented in the workforce. If we were to take a random sample of 100 workers, we might expect to finish with 50 men and 50 women. In practice, we often end up with slight variations such as 48 men and 52 women. If gender constitutes the independent variable of interest to the study, we might want to ensure that the groups are equally represented, and randomly select male workers until the number reaches 50, and likewise for female workers (see
stratified random sampling
, p. 228).
Conducting the study
Here begins the operational stage of the research, the success of which depends not only on how the data are gathered, but on how well the study has been planned. While the research strategy (experimental) has been selected, there are still a variety of research designs at the researcher’s disposal (see experimental and quasi-experimental research design, next) and these have to be selected with care.
Using descriptive and inferential statistics
The data are analysed using a variety of statistical methods, all of which should have been selected at the planning stage. Descriptive statistics are used to describe or summarize a set of data, while inferential statistics are used to make inferences from the sample chosen to a larger population (see
Chapter 24
).
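The distinction can be illustrated with a minimal sketch (the output figures below are invented for illustration): descriptive statistics summarize the sample in hand, while an inferential step — here a rough large-sample confidence interval — generalizes from the sample to the wider population.

```python
import statistics

# Hypothetical sample: weekly output figures for 10 employees
sample = [42, 38, 45, 41, 39, 44, 40, 43, 37, 46]

# Descriptive statistics: summarize the sample itself
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation
print(f"mean={mean:.1f}, sd={sd:.2f}")

# Inferential statistics: generalize to the population. A rough 95%
# confidence interval for the population mean (large-sample z approximation)
n = len(sample)
margin = 1.96 * sd / n ** 0.5
print(f"95% CI: {mean - margin:.1f} to {mean + margin:.1f}")
```

The descriptive figures describe only these ten employees; the interval makes a hedged claim about the population from which they were drawn.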
Accepting or rejecting hypotheses
As we saw in
Chapter 2
, it is impossible to ‘prove’ that any theory is right. All theories are provisional and tentative (until disproved). However, the weight of evidence may be sufficient for a hypothesis to be provisionally accepted. As we will see in
Chapter 24
, experimental design makes use of inferential statistics and probability to calculate the risk involved in accepting the hypothesis as true (when it is in fact false) and rejecting the hypothesis as false (when it is in fact true).
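The risk of wrongly accepting a hypothesis can be illustrated with a short simulation. This is a minimal sketch, not from the text: it assumes normally distributed scores and a simple two-sample z-test, and shows that even when no real difference exists, roughly 5 per cent of experiments will still reject the null hypothesis at the 95 per cent confidence level.

```python
import random
import statistics

random.seed(42)  # fixed seed so the simulation is reproducible

def z_test_rejects(group_a, group_b, critical_z=1.96):
    """Two-sample z-test on means; True if |z| exceeds the 5% critical value."""
    n_a, n_b = len(group_a), len(group_b)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    se = (statistics.variance(group_a) / n_a
          + statistics.variance(group_b) / n_b) ** 0.5
    return abs(diff / se) > critical_z

# Both groups are drawn from the SAME population, so any rejection
# is a false positive (a Type I error)
trials = 2000
false_rejections = 0
for _ in range(trials):
    a = [random.gauss(100, 15) for _ in range(50)]
    b = [random.gauss(100, 15) for _ in range(50)]
    if z_test_rejects(a, b):
        false_rejections += 1

type_i_rate = false_rejections / trials
print(f"Observed Type I error rate: {type_i_rate:.3f}")
```

The observed rate hovers near the 5 per cent significance level chosen in advance, which is exactly the risk the researcher agrees to accept.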
Preparing the formal report
Particularly when a study has been sponsored or commissioned, the researcher will need to prepare and deliver some kind of formal presentation of the findings. At this stage the focus will be on:
· Why the study was conducted.
· What research questions and hypotheses were evaluated.
· How these were turned into a research design (with sufficient detail that the experiment could be replicated).
· What differences were observed between the hypotheses and the results.
· What conclusions can be drawn and whether they support or contradict the hypothesis and existing theories.
In a more organizational and less academic setting, the formal report will tend to focus on the rationale for the study, the kinds of questions being posed, the results, and what findings, if any, can be implemented. Writing the research report is covered in detail in
Chapter 27
. For projects that have received research funding, sponsors usually want to go beyond the report and to be provided with information on how the results of the project will be disseminated.
Experimental and quasi-experimental research design
The basis of true experimental design is that the researcher has control over the experiment, that is, over who, what, when, where and how the experiment is to be conducted. This particularly includes control over the ‘who’ of the experiment – that is, subjects are assigned to conditions randomly. So, for example, a local authority might seek to measure whether a refuse recycling programme was effective or not. Hence, it might run the campaign in several randomly selected areas, but not in others. Where any of the elements of control are either weak or lacking, the study is said to be a quasi-experiment. In organizational settings, for example, it is often only possible for practical purposes to use pre-existing groups. Hence, it is only possible to select subjects from these groups rather than randomly assign them (as in a true experimental study). Another important difference is that while in experiments we can manipulate variables, in quasi-experimental studies we can only observe categories of subjects. So, we could consider the differences between two groups to be the independent variable, but we would not be manipulating this variable. Taking our recycling issue mentioned above, we would collect data on recycling indicators across the local authority, and then seek to discover what independent variables might impact on different recycling rates – for example, social class, ethnic group, etc.
One of the strengths of experimental design is that randomization improves the control over threats to
internal validity
. In other words, if the experimental intervention (treatment) does lead to a change in the dependent variable, there is some justification for believing that this has been caused by the treatment itself, and not just by the effect of some extraneous variable. Yet it should not be assumed that random assignment is the goal of all experimental studies. As Hedrick et al. (1993) point out, using an
experimental group
also means using a
control group
who do not receive the intervention. Even if the treatment does not prove to be effective, it usually comes with more resources. The control group will be denied these, and for a long period if it is a longitudinal study. For example, in the recycling example, above, those in the control group would not receive any potential benefits of the recycling campaign. This of course can be rectified if they are presented with the campaign benefits but after the study is over. However, this would still be after any benefits enjoyed by those in the experimental group – an institutional review board might not approve such a study because of these time lags.
Define: Experimental design
One of the strengths of a quasi-experimental design is that it is about as near as one can get to an experimental design, so it can support causal inferences. In the words of Hedrick et al. (1993), it provides ‘a mechanism for chipping away at the uncertainty surrounding the existence of a specific causal relationship’ (1993: 62). Quasi-experimental designs are best used when:
· Randomization is too expensive, unfeasible to attempt or impossible to monitor closely.
· There are difficulties, including ethical considerations, in withholding the treatment.
· The study is retrospective and the programme being studied is already under way.
According to McBurney and White (2009), experimental designs are generally considered superior to quasi-experimental (and quasi-experimental to non-experimental). However, it may not always be possible to replicate social, organizational or behavioural conditions in a laboratory setting. Therefore, observation in a field setting, say, might be preferable to an experiment because the advantage of realism outweighs the loss of control. The broad differences between experimental, quasi-experimental and non-experimental studies are summarized in
Table 6.3
, and an example of a quasi-experimental design provided in
Case Study 6.2
.
Image 6.1 A coaching session
© iStock.com / kzenon
Table 6.3
Case Study 6.2 A quasi-experimental design
Leonard-Cross (2010) reports on a research study conducted in a large public sector organization, employing over 3,000 staff in 12 geographical locations. The organization had implemented an accredited coach training programme, offering those in management-level posts the opportunity to undertake a coaching qualification and then coach fellow employees. The study sought to evaluate the impact of the programme on those who had received coaching. To do this, a quasi-experimental design was adopted with participants in the survey either in a coached or a non-coached group (the latter randomly selected). The researcher had no control over group allocation since membership of the coached group depended on whether participants had taken part in the coaching programme over the last two years – hence the quasi-experimental nature of the design. The non-coached staff (control group) were matched to the coached staff based on geographical location and job type and were randomly selected by contacts in each geographical location who had no additional knowledge of the research. The study found that participants who had received developmental coaching (N = 61) had higher levels of self-efficacy than the control group of participants (N = 57) who had not received coaching.
Source: Leonard-Cross, 2010
Activity 6.3
Taking
Case Study 6.2
, explain:
1. Why this is a quasi-experimental rather than an experimental study.
2. Why the non-coaching (control) group were matched against the coached group based on geographical location and job type.
Suggested answers are provided at the end of the chapter.
Let us take a look at a number of research designs, starting with frequently used (but faulty) designs and then moving on to some sound designs.
Faulty designs to avoid
Design 1: Non-experimental with intact group
In this design, an intact group is taken and attempts are made to discover why changes in a dependent variable occurred. There is no attempt made here to manipulate any independent variables – hence the design is non-experimental (see
Table 6.4
). Say that a voluntary organization analyses its charitable donation patterns over the past three years by geographic region. The dependent variable is the level of charitable donations for each region. The independent variable is not manipulated but is imagined. In other words, researchers would conduct a study that would try to find explanations for any regional differences, perhaps using documentary evidence. Clearly, the problem here is providing convincing evidence of causation – that a particular independent variable caused the changes in the dependent variable.
In their influential work, Campbell and Stanley (1963) describe designs that are devoid of a control group as being of almost no scientific value. This is not to say that they are completely worthless. Each design might reveal some interesting evidence of value to an organization, but it would be a mistake to draw firm conclusions from them.
Table 6.4
Design 2: Post-test only with non-equivalent control groups
In this type of design, a treatment is given to one group (the experimental group), but not to another (the control). Both groups are then given a
post-test
to see if the treatment has been effective (see
Table 6.5
). Unfortunately, subjects have not been randomly allocated between the experimental and control groups, so it is impossible to say that the two groups are equivalent. If, say, the experimental group performs better in the test, it is not possible to rule out the possibility that this was because the subjects in this group were more able or better motivated. Say, for example, that in a training setting, one group of participants is given coaching to improve their interpersonal skills, but a control group does not receive the coaching. Both take a post-test but the control group does better! This may be because there was no random allocation of subjects (both groups were taken intact) and it so happens that there are more able participants in the control group (or some had received coaching in the past).
Table 6.5
Design 3: One group, pre-test/post-test
In Design 3, a group is measured on the dependent variable by a
pre-test
, an independent variable is introduced, and the dependent variable measured by a post-test (see
Table 6.6
). So, an organization could measure staff attitudes towards racial tolerance, introduce a race-awareness programme, and measure staff attitudes once the programme was completed. Any change in attitudes would be measured by changes in scores between the two tests.
Table 6.6
This design is an improvement on Design 1 as it appears that any changes in attitude could be attributed to the impact of the treatment – the attitude training. Unfortunately, as Campbell and Stanley (1963) point out, there are other factors that could have affected the post-test score. These can impact on the experiment’s internal validity, that is, the extent to which we can be sure that experimental treatments did make a difference to the dependent variable(s). Such factors include:
·
Maturation
effects: people learn over time, which might affect scores on both mental ability and attitude, or they may grow more fatigued over time, which may also affect their post-test scores.
· Measurement procedures: the pre-test itself might have made the subjects more sensitive to race issues and influenced their responses on the post-test. Questions dealing with controversial issues or relying on memory are particularly prone to this effect.
· Instrumentation: in which changes, say, in the observers or scorers used to assess the test results may affect the scores obtained.
· Experimental
mortality
: or the differential loss of respondents from one group compared to the other, for example through absence, sickness or resignations.
· Extraneous variables might influence the results, particularly if there is a large time gap between the pre-test and post-test.
Some sound designs
McBurney and White (2009) state that there is no such thing as a perfect experiment. Nevertheless, there are two elements of design that provide some control over threats to validity and which form the basis of all sound experimental designs: (a) the existence of a control group or a control condition; (b) the random allocation of subjects to groups. Some of the principles of random assignment are explained in the following Web link.
Watch: Random assignment
Go Online 6.1
Watch the following video clip to understand what random assignment is. The URL for the video clip can be accessed via the companion website:
·
http://www.youtube.com/watch?v=V_GIjFw6RZE
Design 4: Experimental group with control
In this design, subjects are randomly assigned to each of the experimental and control groups, which means that, at least theoretically, all independent variables are controlled (see
Table 6.7
). Hence, again using our racial tolerance example, the study would randomly assign groups of people to both the experimental and control groups.
The experimental group would receive the treatment (the race-awareness training) while the control group would not receive the training. Notice that any extraneous variables, such as the effects of the pre-test on attitudes, would be controlled for, since the impact should be the same on both the experimental and control groups. If the training has been genuinely successful, then the improvements in test scores for the experimental group should exceed those for the control group.
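Random assignment of this kind can be sketched in a few lines (the staff roster below is hypothetical): shuffling the subject list and splitting it in half gives every subject an equal chance of ending up in either group.

```python
import random

random.seed(7)  # fixed seed so the assignment is reproducible

# Hypothetical roster of 20 staff members
subjects = [f"employee_{i:02d}" for i in range(1, 21)]

shuffled = subjects[:]       # copy so the original roster is untouched
random.shuffle(shuffled)     # each ordering is equally likely

experimental = shuffled[:10]  # would receive the race-awareness training
control = shuffled[10:]       # would not receive the training

print(len(experimental), len(control))
```

Because assignment depends only on chance, any extraneous variable (prior attitudes, ability, motivation) should, on average, be distributed equally across the two groups.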
Image 6.2 An experimental group receiving training and a control group
© iStock.com / Geber86
© iStock.com / shironosov
Table 6.7
Design 5: Quasi-experimental design with non-equivalent control
Recall that one of the features of quasi-experimental designs is that it is not possible for the researcher to control the assignment of subjects to conditions, and s/he will often have to take groups that are intact (see
Table 6.8
). For example, studies of professional development will often have to use training groups that already exist. A typical feature of quasi-experiments is where we have an experimental and a control group, but subjects have not been randomly allocated to either of the two groups.
Table 6.8
The use of a control group makes this design superior to Designs 1, 2 and 3, since at least the impact of extraneous variables is controlled for, although it is not as reliable as Design 4. If steps can be taken to improve the equivalence between the two groups, then this will improve the validity of the study. Matching, for example, will help in this direction. Here, steps are taken to match subjects between groups against significant variables such as age, sex, income, etc. If matching is not possible, then at least both groups should be chosen from the same population. So, for example, if we are investigating the impact of an incentives package on job performance, we would want to match the experimental (incentives) group and control (non-incentives) group against key variables such as age, work role and seniority.
One of the challenges of using a non-equivalent control group design is in the analysis of the results. McBurney and White (2009) distinguish between desired result patterns and those that it is impossible to interpret. In pattern A (
Figure 6.3
), for example, both the experimental and control groups exhibit the same performance in a pre-test, but only the experimental group improves its performance in the post-test. Although the experimental and control groups are not equivalent, their performances can be compared because their behaviour was the same at the beginning. A similar situation pertains for pattern B – the experimental group performed worse than the control group in the pre-test but improved in the post-test, with the control showing no improvement. It would be difficult to find a reason as to why this process had occurred by chance alone.
Pattern C, however, is much harder to interpret. Although it is true that the performance of the experimental group has improved, the lack of improvement by the control group may be due to the ceiling effect – they began by being better than the experimental group and it may not be possible to improve on this level of performance. Hence, it cannot be deduced that the improvement in the experimental group was due to the treatment. In pattern D the performance of both the experimental and control groups has improved, with the experimental group improving to a higher level. At first sight this might appear to be a significant result but a claim for this would be mistaken since both groups have improved their performance by the same proportion.
Figure 6.3 Interpretable and uninterpretable patterns of results in a non-equivalent control group design with pre-test and post-test
Source: Republished with permission of South-Western College Publishing, a division of Cengage Learning, from McBurney, D.H. and White, T.L. (2009) Research Methods, 8th edn. Belmont, CA: Wadsworth; permission conveyed through Copyright Clearance Center, Inc.
Design 6: Developmental designs
Like interrupted
time-series
designs, developmental designs involve measurement across time and, again, do not involve the use of control groups. One kind of developmental design is the use of a cross-sectional study, which looks at a phenomenon at a particular period of time. For example, a cross-sectional design might study the determinants of accidents in an organization. A survey might be used to calculate an average number of days lost in accidents per employee. The next stage of the survey might examine accident rates by age group, gender, occupational role and seniority. One of the advantages of cross-sectional design is that it can reveal associations among variables (age, gender, etc.). But what it cannot do is reveal causation. To achieve this, we would have to turn to a longitudinal study, taking a series of samples over time. The problem here, however, is that it may be difficult to gain access to the same set of people over a long period. Indeed, even different sets of researchers may have to be employed.
Design 7: Factorial designs
The designs we have considered so far have involved
manipulation
or change in one independent variable. Sometimes, however, it becomes necessary to investigate the impact of changes in two or more variables. One reason for this could be that there is more than one alternative hypothesis to confirm or reject. Another reason might be to explore relationships and interactions between variables. Here we use a factorial design which allows us to look at all possible combinations of selected values.
The simplest form is where we have two variables, each of which has two values or levels. Hence, it is known as a two-by-two (2 × 2) factorial design. In
Figure 6.4
, for example, the two variables are light and heat, each of which has two levels (cold/hot and dull/bright). Hence, we have four possible combinations, as illustrated. We could conduct an experiment to see which combination of factors gives rise to the most attentiveness (measured, say, by production levels, or on a self-assessment questionnaire) in a workplace. We might find, for example, that dull light combined with both heat and cold leads to low levels of attentiveness, as do bright/hot conditions; but the interaction of brightness with cold temperatures keeps all workers ‘on their toes’!
Figure 6.4 A 2 × 2 factorial design showing all possible combinations of factors
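The four cells of the 2 × 2 design can be enumerated directly. This is a small sketch using the two variables from the example above (heat and light, each with two levels):

```python
from itertools import product

# Two factors, each with two levels, as in the workplace example
heat_levels = ["cold", "hot"]
light_levels = ["dull", "bright"]

# product() yields every combination of levels: the 2 x 2 = 4 cells
conditions = list(product(heat_levels, light_levels))
for heat, light in conditions:
    print(f"condition: {heat} / {light}")
```

Adding a third level to either factor (say, a ‘warm’ temperature) would turn this into a 3 × 2 design with six cells, which `product()` handles in exactly the same way.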
Generalizing from samples to populations
A typical human trait is to make generalizations from limited experience or information. For example, we may ask members of staff what they think of the company’s new environmentally friendly transport policy. We may infer that this could be the opinion throughout the organization, the entire workforce constituting what in research terms is known as the population. A population can be defined as the total number of possible units or elements that are included in the study. If it is not possible to evaluate the entire population (because of its large size or a lack of research resources), then we might select a sample of employees for evaluation. According to Fink, ‘A good sample is a miniature of the population – just like it, only smaller’ (2002a: 1).
Define: Population
Top Tip 6.1
The word ‘population’ can often cause some confusion. When we use this word in research methods we do not usually mean the population of a country. In research, a population refers to a group that has something in common – for example, Glasgow human resource managers, Berlin bar owners or Parisian journalists.
The process of selecting samples
A sample will be chosen by a researcher on the basis that it is a
representative sample
of the population as a whole, that is the sample’s main characteristics are similar or identical to those of the population. Samples are selected from a
sampling frame
, that is a list of the population elements (see
Figure 6.5
). Notice that, while every attempt will be made to select a sampling frame that provides details of the entire population, practical circumstances may make the sampling frame incomplete. For example, the population may comprise all people working in airport security over a weekend, but the human resources records may have missed out some staff by mistake, whilst new starters have not yet been entered onto the database. The research sample itself might be smaller than the sampling frame simply because using all the sampling frame records would be too expensive. But having established the sampling frame and how many people we are going to use, how do we choose them?
Read: Representative samples
Most methods utilized to achieve representative samples depend, in some way, on the process of random selection. Random probability sampling means selecting a sample in such a way that each member of the population has an equal chance of being selected. Clearly, this can present practical problems. Can we, for example, acquire a full list of company employees from which to draw the sample (the sampling frame)? But as Black (2001) warns, even after taking a random sample, there remains a finite possibility that it may not be representative of the population after all. The chances of this happening are reduced if the study can be replicated: that is, other random samples are used and studied. Nevertheless, the chances of a sample being representative are higher through random selection than if the sample is purposive (specifically selected by the researcher).
Figure 6.5 Relationship between the population, sampling frame and sample
Of course, we may not always want to attain completely random samples. Again using the simple example of gender, a factory workforce of 100 people might comprise 90 women and 10 men. A random sample of 25 people might give us 23 women and 2 men. Clearly, if gender is the independent variable, a sample of 2 men would probably be of little value to the study. In this case, we might use stratified random sampling by deciding to randomly sample female workers until 15 are picked and follow the same strategy but oversample for men until we have a sample of 10. Let us look at some of the approaches to achieving representativeness in samples.
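That stratified strategy can be sketched as follows (the worker identifiers are invented): sample randomly within each stratum, and since the men's stratum holds only 10 workers, oversampling it means including every one.

```python
import random

random.seed(1)  # fixed seed so the draw is reproducible

# Hypothetical workforce of 100: 90 women and 10 men
women = [f"W{i:02d}" for i in range(90)]
men = [f"M{i:02d}" for i in range(10)]

sample_women = random.sample(women, 15)  # random draw within the stratum
sample_men = random.sample(men, 10)      # all 10 men: the stratum is oversampled

stratified_sample = sample_women + sample_men
print(len(stratified_sample))
```

Within each stratum the draw is still random, so the sample remains probabilistic while guaranteeing that the smaller group is usable for analysis.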
What size sample should we use?
The first stage is to determine the actual size of the sample needed. Before doing this, we need to decide on the size of the
confidence interval
. This is the range of figures between which the population
parameter
is expected to lie. Say we set the confidence interval at 4 per cent, and 45 per cent of the population pick a particular answer. This means that we are saying that we are confident that between 41 per cent (45 – 4) and 49 per cent (45 + 4) of the entire population would have picked that answer. We also decide on a confidence level, usually of either 95 per cent or 99 per cent. This states the probability of including the population
mean
within the confidence interval. This is chosen before working out the confidence interval. In many studies, a confidence level of 95 per cent is often deemed sufficient. In medical research, a level of 99 per cent is usually taken because of the need to be highly confident of estimates. Experimenting with
Activity 6.4
should make this clearer.
Define: Confidence intervals
Go Online 6.2
To calculate the size of sample you need from a given size of population click on the following link:
http://www.surveysystem.com/sscalc.htm
Explore: Sample size calculator
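The arithmetic behind calculators of this kind can be sketched with the standard formula for proportions, n = z²p(1 − p)/e², followed by a finite-population correction. This is a common textbook formula, not taken from the chapter, and it assumes worst-case variability (p = 0.5):

```python
def sample_size(confidence_z=1.96, margin=0.04, population=None, p=0.5):
    """Required sample size for a given confidence level (z) and
    confidence interval (margin, as a proportion)."""
    # Base formula for an effectively infinite population
    n = (confidence_z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite-population correction shrinks n for small populations
        n = n / (1 + (n - 1) / population)
    return round(n)

# 95% confidence level (z = 1.96), 4 per cent confidence interval:
print(sample_size())                 # large population
print(sample_size(population=1000))  # population of 1,000
```

Note how the required sample grows with the confidence level and shrinks as the acceptable confidence interval widens, which is why the 99 per cent level used in medical research demands larger samples.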
Selecting random samples
Having estimated the size of sample you need, you can now go about randomly selecting it. As we have seen, randomization is the process of assigning subjects to experimental and control groups such that the subjects have an equal chance of being assigned to either group. The process of random selection can be accomplished either by using the appropriate statistical table (see
Table 6.9
) or by using a special computer program (see
Activity 6.4
).
Say you have acquired a list of 1,000 of the company’s staff from which you want to randomly select 50 as your research sample. First, ascribe a number to each staff member on the list. Then, using a pencil, close your eyes and point to part of the table. If you happen to select, say, 707, the top number of the third column (
Table 6.9
), take the first two numbers, 70, and work down your list of random numbers in the table to the 70th. Hence, your first number is 799. Then, using the last digit from 707 and the first digit of the next three-digit figure, 872, you get 78. Select the 78th position down the list, which gives you 343. Go back to the number 872 and choose the last two digits of that number, 72, and take the 72nd number from the table, etc. Repeat this process until 50 names have been selected. Now take a look at the Web randomizer (
Activity 6.4
) – you may find it easier!
Table 6.9
Source: Adapted from Black, 2001
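In practice, the table-based procedure above is usually replaced by software. A minimal sketch (the staff list is hypothetical): draw 50 distinct names at random from a list of 1,000, each equally likely to be picked.

```python
import random

random.seed(2024)  # fixed seed so the selection is reproducible

# Hypothetical list of 1,000 company staff
staff = [f"staff_{i:04d}" for i in range(1, 1001)]

# 50 unique selections, each member having an equal chance of inclusion
research_sample = random.sample(staff, 50)

print(len(research_sample))
```

`random.sample` draws without replacement, so no staff member can be selected twice — the same property the manual table procedure is designed to achieve.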
Activity 6.4
Your sample comprises 100 people from whom you want to randomly select 10 as your sample. All people are allocated a number from 1 to 100. You now want to produce a set of 10 random numbers ranging from 1 to 100. In your Web browser, go to the following address:
·
http://www.randomizer.org
Click on [Randomizer] then respond as follows to the questions presented:
· How many sets of numbers do you want to generate? = 1
· How many numbers per set? = 10
· Number range = 1 to 100
· Do you wish each number in a set to remain unique? = Yes
· Do you wish to sort your outputted numbers (from least to greatest)? = Yes
Click on [Randomize Now!]
You should see a set of 10 random numbers arranged in a row.
Types of random sample
In an ideal world, you would have sufficient time and resources to choose completely random samples. In the real world, due to practical constraints, you may have to choose other types of sampling techniques. In quantitative research, random samples are usually preferable to non-random. Given the importance of sampling in research design (both quantitative and qualitative designs),
Chapter 9
, Sampling Strategies in Business, is entirely devoted to this theme.
Top Tip 6.2
Research students often agonize about the need to select a random sample. Indeed, even when using non-random samples, they can become tempted to make claims that the sample was somehow randomly selected. This is misguided for a number of reasons. Firstly, these kinds of studies (especially when undertaken for the purpose of writing a thesis or dissertation), for practical purposes, often work with fairly modest sample sizes, meaning that the ability to generalize is limited. Secondly, when working with such modest samples, it is the quality (representativeness) of the sample that becomes more important rather than the size.
Generalizing from samples to populations
One of the objectives of experimental research is to achieve a situation where the results of a study using a sample can be generalized. According to Kerlinger and Lee (2000), generalizing means that the results of a study can be applied to other subjects, groups or conditions. Generalizing means that the fruits of research can have a broader application than merely being limited to a small group. For example, say that researchers evaluated a staff development programme in which staff were taught to adopt new health and safety practices to reduce accident rates. If the study showed that scores for the trained group were significantly better than for a control group, then the results might be of relevance to other health and safety policy makers. On the other hand, just because a study does not find results that are capable of generalization does not mean they have no relevance. A small case study, for example, may produce findings that are interesting and possibly indicative of trends worthy of replication by further research. And from a perspective-seeking view they may be seen as valid in their own right. The important point is that you should not make firm or exaggerated claims on the basis of small, unrepresentative samples.
Build: Sampling in business
Employability Skill 6.2 Understanding the strengths and weaknesses of selected sampling design
When businesses make decisions, they want to be sure that the data they use are trustworthy and can be relied on. If sampling is involved (which it often is), then the type of sampling strategy used and its strengths and weaknesses need to be understood and taken into account in interpreting data.
Designing valid and reliable research instruments
We have looked, so far, at some of the general principles of research design, including the use of experimental and control groups and the selection of representative samples so that results can be generalized to a larger population. However, for defensible statistical inferences to be made on the basis of the data, any research tools used (such as questionnaires, interview schedules and observation schedules) must be internally valid and reliable. To achieve external validity, such instruments must be designed in such a way that generalizations can be made from the analysis of the sample data to the population as a whole.
Watch: Using quantitative methods
This section deals with some of the general principles of validity and reliability, but these important issues are taken up in more detail when describing the design of specific data collection tools in later chapters.
Principles of validity
To ensure validity, a research instrument must measure what it was intended to measure. This may sound like an obvious statement, but many novice researchers make the mistake of asking spurious questions in a misguided attempt to collect as much data as possible – just in case some data may be needed at the analysis stage! For example, a bank survey might seek to measure customer attitudes towards the counter services it provides, but the data gathering instrument might (erroneously) stray into asking about their attitudes to new financial products. This might be important, but not relevant to the study itself. In discussing validity, McBurney and White (2009) pose the interesting analogy of using a measurement of hat size to determine intelligence. You could measure someone’s hat size, say, every hour and always come up with the same result. The test, then, is reliable. However, it is not valid, because hat size has nothing to do with what is being measured.
In Figure 6.6 we can see that only part of the research instrument covers the subject areas that have been operationally defined. Some operationally defined subjects have not been addressed by the instrument (Zone of Neglect), while other parts of the instrument cover issues of no direct relevance to the research study at all (Zone of Invalidity). To achieve validity, the research instrument subject area and operationally defined subject areas must exactly match (Zone of Validity).
The issue of validity, however, is much more complex than this. The central question around validity is whether a measure of a concept really measures that concept – does it measure what it claims to measure? So, for example, do IQ tests really measure intelligence? Do formal examinations measure academic ability? At a basic level, validity can be defined as eight types: face, internal, external, criterion, construct, content, predictive and statistical validity. We will look at each in turn.
Figure 6.6 Relationship between research instrument and operationally defined subject areas and the issue of validity
Face validity
When developing a new research instrument (such as a questionnaire), it is vital that it is able to demonstrate at least face validity; otherwise all is lost. Face validity means that the instrument at least appears to measure what it was designed to measure. But how do we demonstrate such face validity? For a start, it is up to researchers to study their own instrument and critically evaluate what they have produced. Because they are so 'close' to their own work, the next step is to get other people to comment, particularly if they are subject experts in relation to the concept being measured. Yet, as McBurney and White (2009) warn, face validity is not an end in itself. A test may have a high or low degree of validity regardless of whether it has face validity or not.
Define: Face validity
Top Tip 6.3
In the event that you do not have ready access to relevant subject matter experts, the next best step is to ask friends or colleagues to evaluate the instrument. Make it clear to them what the instrument is meant to measure and that you want a critical appraisal.
Internal validity
Internal validity refers to questions of cause and effect and to the extent to which causal conclusions can be drawn. If we take, for example, an evaluation of the impact of a product promotion campaign, one group receives the promotional material (the experimental group) while one does not (the control group). Possible confounding variables are controlled for by trying to make sure that participants in each group are of similar ages and educational attainment. Internal validity (the impact of the campaign) may be helped by testing only those who are willing to participate in the experiment. But this reduces the completely random nature of the experimental group and hence the external validity of the study (see next).
Define: Internal validity
External validity
This is the extent to which it is possible to generalize from the relationships found in the data within the sample’s experimental subjects to a larger population or setting (Cook and Campbell, 1979). Clearly, this is important in experimental and quasi-experimental studies where sampling is required and where the potential for generalizing findings is often an issue. As Robson (2002) points out, the argument for generalization can be made by either direct demonstration or by making a case. The problem of generalizing from a study is that cynics can argue that its results are of relevance only to its particular setting. Direct demonstration, then, involves carrying out further studies involving different participants and in different settings. If the findings can be replicated (often through a series of demonstrations), then the argument for generalizing becomes stronger. Making a case simply involves the construction of a reasoned argument that the findings can be generalized. So, this would set out to show that the group(s) being studied, or the setting or period, share certain essential characteristics with other groups, settings or periods (Campbell and Stanley, 1963).
Define: External validity
Criterion validity
This is where we compare how people have answered a new measure of a concept, with existing, widely accepted measures of a concept. If answers on the new and established measures are highly correlated, then it is usually assumed that the new measure possesses criterion validity. However, as de Vaus (2002) suggests, a low correlation may simply mean that the old measure was invalid. Furthermore, many concepts have no well-established measures against which to check the new measure. Hence, Oppenheim (1992) is probably correct to state that good criterion measures are notoriously hard to find.
Define: Criterion validity
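Checking a new measure against an established one amounts to computing a correlation between the two sets of scores. A minimal sketch, with invented scores for a hypothetical new and established scale:

```python
# Hypothetical sketch: assessing criterion validity by correlating scores on
# a new scale with scores on an established scale. All data are invented.
from statistics import mean, pstdev

new_scale = [12, 15, 11, 18, 14, 20, 16, 13, 19, 17]
old_scale = [34, 40, 30, 45, 37, 50, 41, 33, 48, 43]

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

r = pearson_r(new_scale, old_scale)
print(f"r = {r:.2f}")  # a high r is taken to support criterion validity
```

As de Vaus's caveat above implies, a high correlation only transfers whatever validity the established measure already has; a low correlation could indict either instrument.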
Construct validity
Construct validity is concerned with the measurement of abstract concepts and traits, such as ability, anxiety, attitude, knowledge, etc., and is concerned with whether the indicators capture the expected relationships among the concepts being researched (Cook and Campbell, 1979). As we saw above, each of these traits has to be operationally defined before it can be measured. Taking each trait, the researcher proceeds to elaborate on all of the characteristics that make up that trait. For example, if we use the construct ‘confidence’ within a particular research context this might be defined as:
· The ability to make quick decisions.
· Sticking with personal decisions once these are made.
· Strong interpersonal skills.
Define: Construct validity
You might reflect here that, in fleshing out traits to this level of detail, it is only a relatively short step to the creation of a research instrument like a questionnaire. While a test that has construct validity should measure what it intends to measure, it is equally important that it should not measure theoretically unrelated constructs (McBurney and White, 2009). So, for example, a test designed to measure attitudes to change should not contain items that seek to measure, say, extraversion.
Content validity
Content validity is associated with validating the content of a test or examination. Since it is important to create a match between what is taught and what is tested, this might include comparing the content and cognitive level of an achievement test with the original specifications in a syllabus. Let us take the case of a computer company that provides a training programme in fault finding and rectification for those retail companies that sell its products. After a two-day training programme, participants are given a 50-question multiple-choice test. The computer company will want to ensure that the content of the test is matched with the content of the training programme so that the entire syllabus is covered, and only issues that have been taught are assessed. Equally, it will want to assure itself that it has delivered the training programme at a level so that attendees learn the skills of problem solving. The assessment, then, will also have to be at this problem-solving level (rather than, say, merely applying rules, or recalling facts) for the test to be valid.
Define: Content validity
Predictive validity
This shows how well a test can forecast a future trait such as job performance or attainment. It is no use if a test for identifying ‘talent’ in an organization has both construct and content validity but fails to identify, say, those who are likely to be ‘high performers’ in a key work role.
Define: Predictive validity
Statistical validity
This is the extent to which a study has made use of the appropriate design and statistical methods that will allow it to detect the effects that are present.
Principles of reliability
According to Black (1999) reliability is an indication of consistency between two measures of the same thing. These measures could be:
· Two separate instruments.
· Two similar halves of an instrument (for example, two halves of a questionnaire).
· The same instrument applied on two occasions.
· The same instrument administered by two different people.
If we were to take another sort of measuring device, a ruler, how sure could we be that it is always a reliable measure? If it is made of metal, does it expand in extreme heat and therefore give different readings on hot and cold days? Alternatively, we might use it on two different days with similar temperatures, but do we mark off the measurement of a line on a piece of paper with the same degree of care and accuracy? For a research tool to be reliable we would expect it to give us the same results when something was measured yesterday and today (providing the underlying trait(s) being measured has not changed). Similarly, we would expect any differences found in traits between two different people to be based on real differences between the individuals and not be due to inconsistencies in the measuring instrument. Reliability is never perfect and so is measured as a correlation coefficient. In the social and business sciences it is rarely above 0.90. If a research instrument is unreliable, it cannot be valid. Like validity, there are several ways of measuring reliability. Black (2001) describes five of them.
Define: Stability coefficient
Stability
This measures the scores achieved on the same test on two different occasions. Any difference is called subject error. For example, a survey of employee attitudes towards their workplace may yield different results if taken on a Monday than on a Friday. To avoid this, the survey should be taken at a more neutral time of the week.
Equivalence
Another way of testing the reliability of an instrument is by comparing the responses of a set of subjects with responses made by the same set of subjects on another instrument (preferably on the same day). This procedure is useful for evaluating the equivalence of a new test compared to an existing one.
Internal reliability
This measures the extent to which a test or questionnaire is homogeneous. In other words, it seeks to measure the extent to which the items on the instrument 'hang together' (Pallant, 2013; Sekaran and Bougie, 2013). Are the individual scale items measuring the same construct? Internal reliability is measured by Cronbach's alpha test, which calculates the average of all split-half reliability coefficients. An alpha coefficient varies between 1 (perfect internal reliability) and 0 (no internal reliability). As a rule of thumb, a figure of 0.7 or above is deemed acceptable. However, as Pallant (2013) warns, Cronbach's alpha results are quite sensitive to the number of items on a scale. For short scales (with fewer than 10 items) it can be quite common to find Cronbach values as low as 0.5.
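Cronbach's alpha can be computed directly from item scores. A minimal sketch with an invented four-item scale, using the standard formula alpha = k/(k−1) × (1 − Σ item variances / total-score variance):

```python
# Hypothetical sketch of Cronbach's alpha for an invented 4-item scale.
# Rows are respondents, columns are items.
from statistics import variance

scores = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 4, 3, 3],
]

k = len(scores[0])                                   # number of items
item_vars = [variance(col) for col in zip(*scores)]  # variance of each item
total_var = variance([sum(row) for row in scores])   # variance of total scores
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Here the items covary strongly, so alpha comfortably clears the 0.7 rule of thumb; note that this is a four-item scale, exactly the short-scale situation where Pallant warns alpha can be misleadingly low (or, as here with artificial data, high).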
Inter-judge reliability
Inter-judge reliability compares the consistency of observations when more than one person is judging. An example would be where two people judge the performance of a member of an organization's marketing staff in selling a product over the telephone to the public. The reliability of the observation is provided by the degree to which the views (scores) of each judge correlate. Observer error can be reduced by using a high degree of structure to the research through the use of a structured observation schedule or questionnaire.
Intra-judge reliability
Where a large amount of data has been collected by a researcher over time, the consistency of observations or scores can be checked by taking a sample set of observations or scores and repeating them. A further problem, and often a significant one, is bias on the part of respondents. It is quite common, for example, for respondents to provide a response they think the researcher is seeking. Particularly if the researcher is seen to be representing 'management', respondents may be reluctant to provide honest answers if these are critical of the organization. Even assurances of confidentiality may not be enough to encourage complete honesty.
Explore: Getting truth in surveys
Top Tip 6.4
A useful way of measuring inter-judge reliability is through use of the Kappa score (recall the calculation in Chapter 5), which compares the level of agreement between two people against what might have been predicted by chance.
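The Kappa calculation compares observed agreement with the agreement expected by chance. A minimal sketch for two hypothetical judges rating ten telephone sales calls (all ratings invented):

```python
# Hypothetical sketch of Cohen's Kappa for two judges rating 10 sales calls
# as 'pass' or 'fail'. All ratings are invented for illustration.
judge_a = ['pass', 'pass', 'fail', 'pass', 'fail',
           'pass', 'pass', 'fail', 'pass', 'fail']
judge_b = ['pass', 'pass', 'fail', 'fail', 'fail',
           'pass', 'pass', 'fail', 'pass', 'pass']

n = len(judge_a)
observed = sum(a == b for a, b in zip(judge_a, judge_b)) / n  # raw agreement
# Chance agreement: both say 'pass' plus both say 'fail'
pa = judge_a.count('pass') / n
pb = judge_b.count('pass') / n
expected = pa * pb + (1 - pa) * (1 - pb)
kappa = (observed - expected) / (1 - expected)
print(f"Kappa = {kappa:.2f}")
```

The judges agree on 8 of 10 calls, but because chance alone would produce substantial agreement, Kappa is noticeably lower than the raw 80% figure, which is exactly why it is preferred to simple percentage agreement.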
Activity 6.5
An organization representing the interests of small businesses in London plans to conduct a survey to measure business optimism, normally a guide to future investment intentions and economic growth. The aims of the survey are to: (a) measure the current level of business optimism and compare this with levels over the last three years; (b) measure the causes of this level of optimism; (c) establish links between levels of optimism and business intentions, such as hiring new employees and investment plans.
There are insufficient financial resources to send the questionnaire to all businesses in London so you must select a sample.
1. What is the population for this research?
2. What is the sampling frame?
3. What kind of sample will you select? Justify your choice.
4. Identify dependent and independent variables.
5. Produce an appropriate research design.
6. Using the aims outlined above, construct a valid and reliable research instrument.
Suggested answers are provided at the end of the chapter.
Summary
· The structure of experimental research generally comprises two stages: the planning stage and the operational stage.
· Experimental research begins from a priori questions or hypotheses that the research is designed to test. Research questions should express a relationship between variables. A hypothesis is predictive and capable of being tested.
· Dependent variables are what experimental research designs are meant to affect through the manipulation of one or more independent variables.
· In a true experimental design the researcher has control over the experiment: who, what, when, where and how the experiment is to be conducted. This includes control over the who of the experiment – that is, subjects are assigned to conditions randomly.
· Where any of these elements of control are either weak or lacking, the study is said to be a quasi-experiment.
· In true experiments, it is possible to assign subjects to conditions, whereas in quasi-experiments subjects are selected from previously existing groups.
· Research instruments need to be both valid and reliable. Validity means that an instrument measures what it is intended to measure. Reliability means that an instrument is consistent in this measurement.
Review questions
1. The use of control groups is essential in quantitative research designs. Do you agree with this view?
2. Studies that make use of descriptive (but not inferential) statistics are of limited value. Discuss.
3. Pre-test/post-test quantitative designs have been criticized. What practical steps can be taken to address the limitations of such a design if it is the only one available?
4. Should generalization always be the goal of quantitative research?
Research Action 6.1: Designing a quantitative project
If you have decided on a quantitative approach, take the following steps, referring to the detailed checklist at the end to ensure you have not overlooked anything:
1. Revisit your ideas for possible research questions. Is it possible to convert these into hypotheses containing two or more variables that can be measured and operationally defined? Break down your questions into subordinate questions which will help to determine the research tools you use (for example, questionnaires/interviews).
2. Revisit your planned statistical methods. Are they descriptive or inferential?
3. Decide between experimental and quasi-experimental research design and write down reasons for your choice.
4. Decide on your experimental and control group.
5. Think carefully about your sampling; choose your sampling frame and the size of your sample and consider whether it is going to be possible to generalize from your sample.
6. Think about your selected research tools in light of validity and reliability.
Your Research Project Checklist
Next Steps In Your Research Project
Push your project forward with a host of resources available to you online:
· Watch videos to build your understanding of important concepts
· Read journal articles to deepen your knowledge and reinforce your learning of key topics
· Discover case study examples that help you to gain insight into real research in the business world
Watch: Designing an experiment
Read: Video game methods
Discover: Quantitative methods in HRM
If you are using the interactive eBook, just click on the icons in the margin to access each resource.
Alternatively, go to:
https://study.sagepub.com/graybusiness2e
Further reading
Creswell, J.W. (2017) Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, 5th edn. Thousand Oaks, CA: Sage. Although written with a broad spectrum of research designs in mind, the book provides useful guidelines on writing research questions and hypotheses and on quantitative methods design.
Kerlinger, F.N. and Lee, H.B. (2000) Foundations of Behavioural Research, 4th edn. Fort Worth, TX: Harcourt College Publishers. Excellent on the pros and cons of various experimental designs and on quantitative research design in general.
McBurney, D.H. and White, T.L. (2012) Research Methods, 9th edn. Belmont, CA: Wadsworth. Written from a psychology perspective, this book provides a useful, largely quantitative approach to some of the principles of research design.
Journal resources
Aguinis, H. and Bradley, K.J. (2014) 'Best practice recommendations for designing and implementing experimental vignette methodology studies', Organizational Research Methods, 17(4): 351–371. Describes and recommends the use of experimental vignette methodology as a way of exercising control over independent variables in situations where this is difficult.
Highhouse, S. (2009) ‘Designing experiments that generalize’, Organizational Research Methods, 12(3): 554–566. Discusses how research can be better designed to go beyond ‘mundane realism’ (superficial resemblance to the real world) to better design treatments.
Nyhan, R.C. and Marlowe, J.R. (1997) ‘Development and psychometric properties of the Organizational Trust Inventory’, Evaluation Review, 21(5): 614–635. Demonstrates the process of scale development to produce a quantitative scale that is both valid and reliable.
Pearson, A.W. and Lumpkin, G.T. (2011) ‘Measurement in family business research: How do we measure up?’, Family Business Review, 24(4): 287–291. Discusses the importance of construct validity and reliability in business research. Offers guidelines for the development of multi-item measures.
Suggested answers for Activity 6.1
1. Descriptive.
2. Descriptive.
3. Impact.
4. Correlation.
5. Normative.
Suggested answers for Activity 6.2
1. Not a good hypothesis, since it contains the subjective word ‘disappointing’. The statement should contain a parameter capable of measurement.
2. This is a research question (to which there could be a variety of answers), not a hypothesis capable of being tested.
3. A good hypothesis since it is testable. Levels of patient satisfaction can be measured and we can see whether levels increase, decrease or stay the same.
Suggested answers for Activity 6.3
1. This is a quasi-experimental study because there was no opportunity to randomly assign subjects to the condition (the coaching group).
2. The non-coached staff (control group) were matched to the coached staff based on geographical location and job type so as to control for these extraneous variables. For example, the proportions of coached and non-coached staff for each geographical region were kept approximately the same.
Suggested answers for Activity 6.4
1. The population comprises all small businesses in London.
2. The sampling frame consists of the organization’s extensive (but probably incomplete) database of members. Some sizes of business (for example, medium-sized) may be more represented in the sampling frame than, say, micro-businesses (less than 10 employees) because they can afford the membership fees.
3. One approach would be to take a completely random sample by allotting a number to businesses in the organization’s database of members. However, it might be hypothesized that certain businesses, for example medium-sized companies, might have a greater impact on employment and growth. Hence, an alternative approach would be to take a purposive sample that focuses more heavily on this size of business. The results might highlight the perceptions of these businesses, but could not be claimed to be representative of London small businesses as a whole.
4. The independent variable is business optimism. There are many potential dependent variables but some might include hiring intentions, capital expenditure plans, the development of new products or services, innovation, etc.
Don’t forget to click on the icons throughout the chapter to access the supporting resources:
You can also access these digital resources at:
https://study.sagepub.com/graybusiness2e
24 Analysing and Presenting Quantitative Data
Chapter outline
· Categorizing data
· Data entry, layout and quality
· Presenting data using descriptive statistics
· Analysing data using descriptive statistics
· The process of hypothesis testing: inferential statistics
· Statistical analysis: comparing variables
· Statistical analysis: associations between variables
Keywords
· Categorizing data
· Data entry
· Descriptive statistics
· Distributions
· Hypotheses
· Inferential statistics
· Significance
· Correlation analysis
· Regression
Chapter objectives
After reading this chapter you will be able to:
· Prepare quantitative data for analysis.
· Select appropriate formats for the presentation of quantitative data.
· Choose the most appropriate techniques for describing data (descriptive statistics).
· Choose and apply the most appropriate statistical techniques for exploring relationships and trends in data (correlation and inferential statistics).
As we have seen in previous chapters, the distinction between quantitative and qualitative research methods is often blurred. Take, for example, survey methods. These can be purely descriptive in design, but, on the other hand, the gathering of respondent profile data provides an opportunity for finding associations between classifications of respondents and their attitudes or behaviour, providing the potential for quantitative analysis.
One of the essential features of quantitative analysis is that, if you have planned your research tool, collected your data and now you are thinking of how to analyse them – you are too late! The process of selecting statistical tests should take place at the planning stage of research, not at implementation. This is because it is so easy to end up with data for which there is no meaningful statistical test. Robson (2002) also provides an astute warning that, particularly with the aid of the modern computer, it becomes much easier to generate elegantly presented rubbish, reminding us of GIGO – Garbage In, Garbage Out (Robson, 2002).
The aim of this chapter is to introduce you to some of the basic statistical techniques. It does not pretend to provide you with an in-depth analysis of more complex statistics, since there are specialized textbooks for this purpose. It is assumed that you will have access to a computer and an appropriate software application for statistical analysis, particularly IBM SPSS Statistics. Note that in this chapter, rather than offer you Activities, Worked Examples using statistical formulae will be provided. In some cases, these will be supported with data that you can access by clicking on the icons in the margin or by visiting the book’s online resources. These datasets allow you to apply some statistical tests to ‘real’ data.
Watch: Basic statistics
Top Tip 24.1
If you are relatively new to statistics, try to get access to someone more experienced than yourself to act as a guide or mentor. Also, of course, if you have an academic supervisor, ensure that you maintain regular contact and ask for advice. As suggested in Chapter 23, there are also many useful online tutorials on statistics on YouTube. If you are new to statistics, you might find it helpful to add the word 'basic' to 'statistics' in the YouTube search engine.
Categorizing data
The process of categorizing data is important because, as was noted in Chapter 6, the statistical tests that are used for data analysis will depend on the type of data being collected. Hence, the first step is to classify your data into one of two categories, categorical or quantifiable (see Figure 24.1). Categorical data cannot be quantified numerically but are either placed into sets or categories (nominal data) or ranked in some way (ordinal data). Quantifiable data can be measured numerically, which means that they are more precise. Within the quantifiable classification there are two additional categories of interval and ratio data. All of these categories are described in more detail below. Saunders et al. (2012) warn that if you are not sure about the level of detail you need in your research study, it is safest to collect data at the highest level of precision possible.
Read: Categorical data analysis
In simple terms, these data are used for different analysis purposes. Table 24.1 suggests some typical uses and the kinds of statistical tests that are appropriate.

As Diamantopoulos and Schlegelmilch (1997) point out, the four kinds of measurement scale are nested within one another: as we move from a lower level of measurement to a higher one, the properties of the lower type are retained. Thus, all the statistical tests appropriate to the lower type of data can be used with the higher types as well as additional, more powerful tests. But this does not work in reverse: as we move from, say, interval data to ordinal, the tests appropriate for the former cannot be applied to the latter. For categorical data only, non-parametric statistical tests can be used, but for quantifiable data (see Figure 24.1), more powerful parametric tests need to be applied. Hence, in planning data collection it is better to design data gathering instruments that yield interval and ratio data, if this is appropriate to the research objectives. Let us look at each of the four data categories in turn.
Figure 24.1 Types of categorical and quantifiable data
Table 24.1
Nominal data
Define: Nominal scale
Nominal data constitute a name value or category with no order or ranking implied (for example, sales departments, occupational descriptors of employees, etc.). A typical question that yields nominal data is presented in Figure 24.2, with a set of data that results from this presented in Table 24.2. Thus, we can see that with nominal data, we build up a simple frequency count of how often the nominal category occurs.
Figure 24.2 Types of questions that yield nominal data
Table 24.2
Ordinal data
Define: Ordinal measure
Ordinal data comprise an ordering or ranking of values, although the intervals between the ranks are not intended to be equal (for example, an attitude questionnaire). A type of question that yields ordinal data is presented in Figure 24.3. Here there is a ranking of views (Sometimes, Never, etc.) where the order of such views is important but there is no suggestion that the differences between each scale are identical. Ordinal scales are also used for questions that rate the quality of something (for example, very good, good, fair, poor, etc.) and agreements (for example, Strongly Agree, Agree, Disagree, etc.). The typical results of gathering ordinal data are taken from Figure 24.3 and presented in Table 24.3.
Interval data
Define: Interval measure
With quantifiable measures such as interval data, numerical values are assigned along an interval scale with equal intervals, but there is no zero point where the trait being measured does not exist. For example, a score of zero on a traditional IQ test would have no meaning. This is because the traditional IQ score is the raw (actual) score converted into a mental age divided by chronological age. Another characteristic of interval data is that the difference between a score of 14 and 15 would be the same as the difference between a score of 91 and 92. Hence, in contrast to ordinal data, the differences between categories are identical. The kinds of results from interval data are illustrated in Table 24.4, delivered as part of a company's aptitude assessment of staff.
Figure 24.3 Types of questions that yield ordinal data
Table 24.3
Table 24.4
Ratio data
Define: Ratio scale
Ratio data are a sub-set of interval data, and the scale is again interval, but there is an absolute zero that represents some meaning – for example, scores on an achievement test. If an employee, for example, undertakes a work-related test and scores zero, this would indicate a complete lack of knowledge or ability in this subject! An example of ratio data is presented in Table 24.5.
This sort of classification scheme is important because it influences the ways in which data are analysed and what kind of statistical tests can be applied. Having incorporated variables into a classification scheme, the next stage is to look at how data should be captured and laid out, prior to analysis and presentation.
Table 24.5
Data entry, layout and quality
Data entry involves a number of stages, beginning with ‘cleaning’ the data, planning and implementing the actual input of the data, and dealing with the thorny problem of missing data. Ways of avoiding the degradation of data will also be discussed.
Cleaning the data
Watch: Cleaning data
Data analysis will only be reliable if it is built upon the foundations of ‘clean’ data: that is, data that have been entered into the computer accurately. When entering data containing a large number of variables and many individual records, it is easy to enter a wrong figure or to miss an entry. One solution is for two people to enter data separately and to compare the results, but this is expensive. Another approach is to use frequency analysis on a column of data, which will throw up any spurious figures that have been entered. For example, if you are using numbers 1 to 5 to represent individual codes for each of five categories, the frequency analysis might show that you had also entered the number 8 – clearly a mistake. Where there are branching or skip questions (recall Chapter 14), it may also be necessary to check that respondents are going through the questions carefully. For example, they may be completing sections that do not apply to them or missing other sections.
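A frequency check of this kind takes only a few lines to automate. A minimal Python sketch (the response column and the set of valid codes are hypothetical, invented for illustration):

```python
from collections import Counter

def find_spurious_codes(column, valid_codes):
    """Count each code in a data column and flag any value that is
    not in the set of valid codes - a quick data-cleaning check."""
    counts = Counter(column)
    return {code: n for code, n in counts.items() if code not in valid_codes}

# Hypothetical column coded 1-5; the stray 8 is a data-entry error.
responses = [1, 3, 2, 5, 4, 8, 1, 2]
print(find_spurious_codes(responses, valid_codes={1, 2, 3, 4, 5}))  # {8: 1}
```

Running such a check on every coded column before analysis catches most keying errors cheaply, without the cost of double entry.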
Data coding and layout
Coding usually involves allocating an identification number (Id) to data. Take care, however, not to make the mistake of subsequently analysing the codes as raw data! The codes are merely shorthand ways of describing the data. Once the coding is completed, it is possible to collate the data into groups of less detailed categories. So, in Case Study 24.1, the categories could be recoded to form the groups Legal and Financial and then Health and Safety.
The most obvious approach to data layout is the use of tables in the form of a data matrix. Within each data matrix, columns will represent a single variable while each row presents a case or profile. Hence, Table 24.6 illustrates an example of data from a survey of employee attitudes. The second column, labelled ‘Id’, is the survey form identifier, allowing the researcher to check back to the original survey form when checking for errors. The next column contains numbers, each of which signifies a particular department. Length of service is quantifiable data representing actual years spent in the organization, while seniority is again coded data signifying different scales of seniority. Thus, the numerical values have different meanings for different variables. Note that Table 24.6 is typical of the kind of data matrix that can be set up in a computer program such as SPSS, ready for the application of statistical formulae.
Case Study 24.1 illustrates the kind of survey layout and structure that yields data suitable for a data matrix (presented at the end of the case study). Hence, we have a range of variables and structured responses, each of which can be coded.
Table 24.6
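A data matrix of this kind can be sketched in any language. A minimal Python version (all values, column names and labels are invented) also illustrates the warning above: codes are shorthand for labels, not numbers to be analysed arithmetically.

```python
# Toy data matrix in the row-per-case, column-per-variable layout
# described above. All figures are invented for illustration.
columns = ["Id", "Department", "LengthOfService", "Seniority"]
rows = [
    [1, 2, 11, 3],
    [2, 1, 4, 2],
    [3, 3, 7, 1],
]

def recode(value, mapping):
    """Translate a numeric code back into its label - the codes are
    merely shorthand and should never be averaged or summed."""
    return mapping[value]

# Hypothetical department labels for the codes in column 2.
department_labels = {1: "Finance", 2: "Legal", 3: "Health and Safety"}
for row in rows:
    print(row[0], recode(row[1], department_labels))
```

The same structure is what SPSS builds internally when you define variables and enter cases.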
Case Study 24.1 From survey instrument to data matrix
A voluntary association that provides free advice to the public seeks to discover which of its services are most utilized. A survey form is designed dealing with four potential areas, namely the law, finance, health and safety in the home.
Question: Please look at the following services and indicate whether you have used any of them in the last 12 months.
The data are collected from 100 respondents and input into the following data matrix using the numerical codes: 1 = Yes; 2 = No; 0 = no data or non-response.
Note that Respondent 3 has ticked the box for ‘Legal advice’ but has failed to complete any of the others – hence, a ‘0’ for no data has to be put in the matrix for those cells.
Dealing with missing data
Oppenheim (1992) notes that the best approach to dealing with missing data is not to have any! Hence, steps should be taken to ensure that data are collected from the entire intended sample and that non-response is kept to a minimum. But in practice, we know that there will be cases where a respondent either has not replied or has not answered all the questions. The issue here is one of potential bias – has the respondent omitted those questions s/he feels uneasy about or hostile to answering? For example, in answering a staff survey on working practices, are those with the worst records on absenteeism more likely to omit the questions on this (hence, potentially biasing the analysis)?
It might be useful to distinguish between four different types of missing values: ‘Not applicable’ (NA), ‘Refused’ (RF), ‘Did not know’ (DK) and ‘Forgot to answer’ (FA). Making this distinction may help you to adopt strategies for coping with this data loss.
Table 24.7 illustrates examples of these responses.
Table 24.7
You may note that the categories chosen for non-response may depend largely on the researcher’s inferences or guesswork. How do we know that someone forgot to answer or simply did not know how to respond? Of course, if many people fail to answer the same question, this might suggest there is something about the question they do not like – in which case, this could be construed as ‘Refusal’. You may decide to ignore these separate categories and just use a single ‘No answer’ label. Alternatively, you might impute a value, where this is possible, by taking the average of other people’s responses. There are dangers, however, in this approach, particularly for single-item questions. Note that some statisticians have spent almost a lifetime pondering issues of this kind! Imputation is safer when the missing value belongs to just one of a number of sub-questions (for which data are available). Note, also, that this becomes unfeasible if there are many non-responses to the same question, since it would leave the calculation based on a small sample.
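The averaging strategy just described can be sketched in a few lines of Python. The scores below are invented, and a real study would need to weigh the bias risks discussed above before imputing anything:

```python
from statistics import mean

def impute_mean(scores):
    """Replace missing values (None) with the mean of the observed
    responses - defensible only when few values are missing."""
    observed = [s for s in scores if s is not None]
    fill = mean(observed)
    return [fill if s is None else s for s in scores]

# Hypothetical sub-question scores with one non-response.
print(impute_mean([4, 5, None, 3, 4]))  # the None becomes 4, the observed mean
```

Note how quickly this breaks down: with many missing responses the ‘observed’ mean rests on a small and possibly biased sub-sample.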
Avoiding the degradation of data
It is fairly clear when non-response has occurred, but it is also possible to compromise the quality of data by the process of degradation. Say we were interested in measuring the age profile of the workforce and drew up a questionnaire, as illustrated in Figure 24.4. One problem here is that the age categories are unequal (for example, 18–24 compared with 25–34). But a further difficulty is the loss of information that comes with collecting the data in this way. We have ended up with an ordinal measure of what should be ratio data and cannot even calculate the average age of the workforce. Far better would have been simply to ask for each person’s exact age (for example, by requesting their date of birth) and the date the questionnaire was completed. After this, we could calculate the average age (mean), the modal (most frequently occurring) age and identify both the oldest and youngest workers.
Figure 24.4 Section of questionnaire comprising an age profile
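Collecting dates of birth, as suggested, lets us compute all of these statistics directly. A sketch using Python’s standard library (the dates of birth and the survey date are invented):

```python
from datetime import date
from statistics import mean, mode

def age_in_years(born, on):
    """Exact age in completed years on a given date."""
    return on.year - born.year - ((on.month, on.day) < (born.month, born.day))

# Hypothetical dates of birth from a questionnaire completed on 1 June 2024.
survey_date = date(2024, 6, 1)
births = [date(1990, 3, 14), date(1990, 8, 2), date(1991, 1, 30), date(1999, 1, 8)]
ages = [age_in_years(b, survey_date) for b in births]

# Mean age, modal age, youngest and oldest worker - none of which
# could be recovered from banded age categories.
print(mean(ages), mode(ages), min(ages), max(ages))
```

Because the raw ratio data are preserved, any later banding into categories remains possible; the reverse is not.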
Presenting data using descriptive statistics
One of the aims of descriptive statistics is to describe the basic features of a study, often through the use of graphical analysis. Descriptive statistics are distinguished from inferential statistics in that they attempt to show what the data are, while inferential statistics try to draw conclusions beyond the data – for example, inferring what a population may think on the basis of sample data.
Descriptive statistics, and in particular the use of charts or graphs, certainly provide the potential for the communication of data in readily accessible formats, but the kinds of graphics used will depend on the types of data being presented. This is why the start of this chapter focused on classifying data into nominal, ordinal, interval and ratio categories, since not all types of graph are appropriate for all kinds of data. Black (1999) provides a neat summary of what is appropriate (see Table 24.8).
Table 24.8
Source: Adapted from Black, 1999: 306
Employability Skill 24.1 Selecting appropriate graphs and tables for the presentation of information
Selecting appropriate charts and graphs is the key to making your data meaningful and communicating your message to your audience. The aim is to summarize and organize your material in the most easily understood format for the type of data that you have.
Nominal and ordinal data – single groups
As we saw earlier, nominal data are a record of categories or names, with no intended order or ranking, while ordinal data do assume some intended ordering of categories. Taking the nominal data in Table 24.2, we can present a bar chart (Figure 24.5) for the frequency count of staff in different departments.
Figure 24.5 Bar chart for the nominal data in Table 24.2
Figure 24.6 shows that this same set of data can also be presented in the form of a pie chart. Note that pie charts are suitable for illustrating nominal data but are not appropriate for ordinal data – because a pie chart presents proportions of a total, not the ordering of categories.
Figure 24.6 Pie chart of the nominal data in Figure 24.5
Interval and ratio data – single groups
Interval and ratio data describe scores on tests, age, weight, annual income, etc., for a group of individuals. These numbers are then, usually, translated into a frequency table, such as in Table 24.3. The first stage is to decide on the number of intervals in the data. Black (1999) recommends between 10 and 20 as acceptable, since going outside this range would tend to distort the shape of the histogram or frequency polygon. Take a look at the data on the age profile of the entire workforce in an e-commerce development organization, presented in Table 24.9. The age range is from 23 to 43 – a span of 21 age values, counting both endpoints. If we selected an interval range of three, this would give us a set of only seven age ranges and conflict with Black’s (1999) recommendation that a minimum of 10 ranges is acceptable. If, however, we took two as the interval range, we would end up with 11 sets of intervals, as in Table 24.10, which is acceptable. We then take these data for graphical presentation in the form of a histogram, as in Figure 24.7.
Table 24.9
Table 24.10
Figure 24.7 Histogram illustrating interval data in Table 24.10
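The interval arithmetic above can be expressed as a one-line calculation, which makes it easy to try out candidate widths before building the frequency table:

```python
import math

def count_intervals(low, high, width):
    """Number of equal-width class intervals needed to cover an
    inclusive range of values."""
    return math.ceil((high - low + 1) / width)

# Ages 23-43: a width of 3 gives only 7 intervals (too few, per Black),
# while a width of 2 gives 11, within the recommended 10-20.
print(count_intervals(23, 43, 3), count_intervals(23, 43, 2))  # 7 11
```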
Nominal data – comparing groups
So far, we have looked at presenting single sets of data. But often research will require us to gather data on a number of related characteristics and it is useful to be able to compare these graphically. For example, returning to Table 24.2 and the number of employees per department, these may be aggregate frequencies, based on the spread of both male and female workers per department, as in Figure 24.8.
Another way of presenting these kinds of data is where it is useful to show not only the distribution between groups, but also the total size of each group, as in Figure 24.9.
Figure 24.8 Bar chart for nominal data with comparison between groups
Figure 24.9 Stacked bar chart for nominal data with comparison between groups
Interval and ratio data – comparing groups
It is sometimes necessary to compare two groups for traits that are measured as continuous data. While this exercise is, as we have seen, relatively easy for nominal data that are discrete, for interval and ratio data the two sets of data may overlap and one hide the other. The solution is to use a frequency polygon. As we can see in Figure 24.10, we have two sets of continuous data of test scores, one set for a group of employees who have received training and another for those who have not. The frequency polygon enables us to see both sets of results simultaneously and to compare the trends.
Figure 24.10 Frequency polygons for two sets of continuous data showing test scores
Two variables for a single group
You may also want to compare two variables for a single group. Returning once more to our example of departments, we might look at the age profiles of the workers in each of them. Figure 24.11 shows the result.
Figure 24.11 Solid polygon showing data for two variables: department and age
Analysing data using descriptive statistics
A descriptive focus involves the creation of a summary picture of a sample or population in terms of key variables being researched. This may involve the presentation of data in graphical form (as in the previous section) or the use of descriptive statistics, as discussed here.
Frequency distribution and central tendency
Frequency distribution is one of the most common methods of data analysis, particularly for analysing survey data. Frequency simply means the number of instances in a class, and in surveys it is often associated with the use of Likert scales. So, for example, a survey might measure customer satisfaction for a particular product over a two-year period.
Table 24.11 presents a typical set of results, showing what percentage of customers answered for each attitude category to the statement: ‘We think that the Squeezy floor cleaner is good value for money.’
Table 24.11
Comparing the data between the two years, it appears that there has been a 7 per cent rise in the number of customers who ‘Strongly agree’ that the floor cleaner is good value for money. Unfortunately, just to report this result would be misleading because, as we can see, there has also been a 6 per cent rise in those who ‘Strongly disagree’ with the statement. So what are we to make of the results? Given that the ‘Agree’ category has fallen by 7 per cent and the ‘Disagree’ category by 6 per cent, have attitudes moved for or against the product? To make sense of the data, two approaches need to be adopted.
· The use of all the data, not just selected figures that meet the researcher’s agendas.
· A way of quantifying the results using a single, representative figure.
This scoring method involves the calculation of a mean score for each set of data. Hence the categories could be given a score, as illustrated in Table 24.12.
All respondents’ scores can then be added up, yielding the set of scores presented in Table 24.13, and the mean, showing that, overall, attitudes have moved very slightly in favour of the product.
Table 24.12
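The scoring calculation works as follows in a short sketch. The category percentages below are invented, not those of Table 24.11, and the 1–5 scores are the usual Likert weights:

```python
# Usual Likert weights for the five attitude categories.
scores = {"Strongly agree": 5, "Agree": 4, "Neither": 3,
          "Disagree": 2, "Strongly disagree": 1}

def mean_score(percentages):
    """Weighted mean attitude score: each category's score weighted by
    the percentage of respondents choosing it (percentages sum to 100)."""
    return sum(scores[cat] * pct for cat, pct in percentages.items()) / 100

# Invented distribution of 100 per cent of respondents across categories.
year1 = {"Strongly agree": 20, "Agree": 40, "Neither": 20,
         "Disagree": 12, "Strongly disagree": 8}
print(mean_score(year1))  # 3.52
```

A single figure like 3.52 can then be compared across years – subject to the caveat below that it treats the ordinal categories as equally spaced.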
Since the data can be described by the mean, a single figure, it becomes possible to make comparisons between different parts of the data or, if, say, two surveys are carried out at different periods, across time. Of course, there are also dangers in this approach. There is an assumption (possibly a mistaken one) that the differences between these ordinal categories are identical. Furthermore, the mean is only one measure of central tendency; others include the median and the mode. The median is the central value when all the scores are arranged in order. The mode is simply the most frequently occurring value. If the median and mode are less than the mean, the distribution of scores is skewed to the right (a positive skew, with a long tail of high scores); if they are greater than the mean, the scores are skewed to the left (a negative skew). So, while two mean scores could be identical, this need not imply that two sets of scores were the same, since each might have a different distribution of scores.
Table 24.13
Having made these qualifications, this scoring method can still be used, but is probably best utilized over multiple sets of scores rather than just a single set. It is also safest used for descriptive rather than for inferential statistics.
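The relationship between the three measures of central tendency is easy to see with a small example (the scores are invented):

```python
from statistics import mean, median, mode

# Hypothetical, positively skewed scores: one high outlier pulls the
# mean above both the median and the mode.
scores = [2, 3, 3, 3, 4, 5, 9]

print(mean(scores), median(scores), mode(scores))
# Mode and median sit below the mean -> long tail of high scores.
```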
Measuring dispersion
In addition to measuring central tendency, it may also be important to measure the spread of responses around the mean to show whether the mean is representative of the responses or not.
There are a number of ways of calculating measures of dispersion:
· The range: the difference between the highest and the lowest scores.
· The inter-quartile range: the difference between the score that has a quarter of the scores below it (often known as the first quartile or the 25th percentile) and the score that has three-quarters of the scores below it (the 75th percentile).
· The variance: a measure of the average of the squared deviations of individual scores from the mean.
· The standard deviation: a measure of the extent to which responses vary from the mean, derived by calculating each score’s deviation from the mean, squaring these deviations, averaging them and taking the square root. Like the mean, because it yields a single figure, it allows comparisons to be made between different parts of a survey and across time periods.
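All four measures can be computed directly with Python’s standard statistics module (the scores below are invented for illustration; `pvariance`/`pstdev` treat the data as a whole population rather than a sample):

```python
from statistics import pvariance, pstdev, quantiles

scores = [4, 8, 6, 5, 3, 7, 9, 5, 6, 7]   # hypothetical survey scores

data_range = max(scores) - min(scores)     # range
q1, q2, q3 = quantiles(scores, n=4)        # quartiles (25th, 50th, 75th)
iqr = q3 - q1                              # inter-quartile range
variance = pvariance(scores)               # mean of squared deviations
sd = pstdev(scores)                        # square root of the variance

print(data_range, iqr, variance, sd)
```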
Normal and skewed distributions
The normal distribution curve is bell-shaped, that is, symmetrical around the mean (x̄), which means that there are an equal number of subjects above and below the mean. The shape of the curve also indicates the proportion of subjects at each of the standard deviations (1SD, 2SD, etc.) above and below the mean. Thus, in Figure 24.12, 34.13 per cent of the subjects fall within one standard deviation above the mean and another 34.13 per cent within one standard deviation below it.
In the real world, however, it is often the case that distributions are not normal, but skewed, and this will have implications for the relationship between the mean, the mode and the median. A distribution is said to be skewed if one of its tails is longer than the other. Where the distribution is positively skewed, it has a long tail in a positive direction (to the right) and the majority of the subjects are below, to the left of the mean in terms of the trait or attitude being measured. With a negative skew, the tail is in a negative direction (to the left) and the majority of subjects are above the mean (to the right).
Figure 24.12 The theoretical ‘normal’ distribution with mean = 0
The process of hypothesis testing: inferential statistics
We saw in Chapter 3 that the research process may involve the formulation of a hypothesis or hypotheses that describe the relationship between two variables. In this section we will re-examine hypothesis testing in a number of stages, which comprise:
· Hypothesis formulation.
· Specification of significance level (to see how safe it is to accept or reject the hypothesis).
· Identification of the probability distribution and definition of the region of rejection.
· Selection of appropriate statistical tests.
· Calculation of the test statistic and acceptance or rejection of the hypothesis.
Hypothesis formulation
As we saw in Chapter 3, a hypothesis is a statement concerning a population (or populations) that may or may not be true, and constitutes an inference or inferences about a population, drawn from sample information.
Let us say that we are interested in the relationship between corporate entrepreneurship and strategic management. According to Schumpeter (1950), entrepreneurship involves the introduction of new products, new methods of production and other innovations. Barringer and Bluedorn (1999) suggest that strategic management involves five dimensions, namely: scanning intensity, locus of planning, planning flexibility, planning horizon and control attributes. Taking just the first dimension, scanning intensity is the managerial activity of learning about events and trends in an organization’s environment, which should yield new business opportunities. Hence, Barringer and Bluedorn (1999) formulate a hypothesis in the following manner:
Hypothesis 1: A positive relationship exists between scanning intensity and corporate entrepreneurship intensity.
However, we can never ‘prove’ something to be true, because there always remains a finite possibility that one day someone will emerge with a refutation. Hence, for research purposes, we usually phrase a hypothesis in its null (negative) form. So, we would state the hypothesis as:
Hypothesis 1: There is no relationship between scanning intensity and corporate entrepreneurship intensity.
Then, if we find that a statistically significant relationship exists, we can reject the null hypothesis.
Hypotheses come in essentially three forms. Those that:
· Examine the characteristics of a single population (and may involve calculating the mean, median and standard deviation and the shape of the distribution).
· Explore contrasts and comparisons between groups.
· Examine associations and relationships between groups.
For one research study, it may be necessary to formulate a number of null hypotheses incorporating statements about distributions, scores, frequencies, associations and correlations.
Specification of significance level
Having formulated the null hypothesis, we must next decide on the circumstances in which it will be accepted or rejected. Since we do not know with absolute certainty whether the hypothesis is true or false, ideally we would want to reject the null hypothesis when it is false, and to accept it when it is true. However, since there is no such thing as absolute certainty (especially in the real world!), there is always a chance of rejecting the null hypothesis when in fact it is true (called a Type I error) and accepting it when it is in fact false (a Type II error). Table 24.14 presents a summary of possible outcomes.
Table 24.14
What is the potential impact of these errors? Say, for example, we measure whether a new training programme improves staff attitudes to customers, and we express this in null terms (the training will have no effect). If we made a Type I error then we are rejecting the null hypothesis, and therefore claim that the training does have an effect when, in fact, this is not true. You will, no doubt, recognize that we do not want to make claims for the impact of independent variables that are actually false. Think of the implications if we made a Type I error when testing a new drug! We also want to avoid Type II errors, since here we would be accepting the null hypothesis and therefore failing to notice the impact that an independent variable was having.
Type I and Type II errors are the converse of each other. As Fielding and Gilbert (2006) observe, anything we do to reduce a Type I error will increase the likelihood of a Type II error, and vice versa. Whichever error is the most likely depends on how we set the significance level (see following section).
Identification of the probability distribution
What are the chances of making a Type I error? This is expressed by what is called the significance level, which states the probability of making such a mistake. The significance level is always set before a test is carried out, and is traditionally set at either 0.05, 0.01 or 0.001. Thus, if we set our significance level at 5 per cent (p = .05), we are willing to take the risk of rejecting the null hypothesis, when in fact it is correct, 5 times out of 100.
All statistical tests are based on an area of acceptance and an area of rejection. For what is termed a one-tailed test, the rejection area is either the upper or lower tail of the distribution. A one-tailed test is used when the hypothesis is directional: that is, it predicts an outcome at either the higher or lower end of the distribution. But there may be cases when it is not possible to make such a prediction. In these circumstances, a two-tailed test is used, for which there are two areas of rejection – both the upper and lower tails. For example, for the z distribution where p = .05 and a two-tailed test, statistical tables show that the area of acceptance for the null hypothesis is the central 95 per cent of the distribution and the areas of rejection are the 2.5 per cent in each tail (see Figure 24.13). Hence, if the test statistic is less than −1.96 or greater than 1.96, the null hypothesis will be rejected.
Figure 24.13 Areas of acceptance and rejection in a standard normal distribution with an α of .05
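The two-tailed decision rule just described reduces to a single comparison against the critical value. A minimal sketch (the z scores passed in are invented examples):

```python
def two_tailed_decision(z, critical=1.96):
    """Reject the null hypothesis when the test statistic falls in
    either 2.5% tail of the standard normal distribution (alpha = .05)."""
    return "reject H0" if abs(z) > critical else "accept H0"

print(two_tailed_decision(2.3))   # beyond +1.96 -> reject H0
print(two_tailed_decision(-1.2))  # inside the central 95% -> accept H0
```

For a one-tailed test at the same alpha, the single rejection region starts at ±1.645 instead, on the side the hypothesis predicts.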
Selection of appropriate statistical tests
The selection of statistical tests appropriate for each hypothesis is perhaps the most challenging feature of using statistics but also the most necessary. It is all too easy to formulate a valid hypothesis only to choose an inappropriate test, with the result – statistical nonsense! The type of statistical test used will depend on quite a broad range of factors.
Firstly, the type of hypothesis matters – for example, hypotheses concerned with the characteristics of groups, compared with relationships between variables. Even within these broad groups of hypotheses, different tests may be needed. So a test for comparing differences between group means will be different from one comparing differences between medians. Even for the same sample, different tests may be used depending on the size of the sample. Secondly, assumptions about the distribution of populations will affect the type of statistical test used. For example, different tests will be used for populations for which the data are evenly distributed compared with those that are not. A third consideration is the level of measurement of the variables in the hypothesis. As we saw earlier, different tests are appropriate for nominal, ordinal, interval and ratio data: only non-parametric tests are suitable for nominal and ordinal data, but parametric tests can be used with interval and ratio data. Parametric tests also work best with larger sample sizes (that is, at least 30 observations per variable or group) and are more powerful than non-parametric tests. This simply means that they are more likely to reject the null hypothesis when it should be rejected, avoiding Type II errors. Motulsky (1995) advises that parametric tests should usually be selected if you are sure that the population is normally distributed.
Table 24.15 provides a summary of the kinds of statistical test available in the variety of circumstances just described.
Table 24.15
Source: Adapted from Fink, 2003
In the sections that follow, we will take some examples from Table 24.15 and apply them for the purpose of illustration.
Statistical analysis: comparing variables
In this section and the one that follows, we will be performing a number of statistical tests. It will be assumed that readers will have access to SPSS.
Nominal data – one sample
In the following section we will look at comparing relationships between variables, but here we will confine ourselves to exploring the distribution of a variable. Firstly, if we assume a pre-specified distribution (such as a normal distribution), we can compare the observed (actual data) frequencies against expected (theoretical) frequencies, to measure what is termed the goodness-of-fit.
Let us say that a company is interested in comparing disciplinary records across its four production sites by measuring the number of written warnings issued in the past two years. We might assume that, since the sites are of broadly equal size in terms of people employed, the warnings might be evenly spread across these sites, that is, 25 per cent for each. Since the total number of recorded written warnings is 116 (see Table 24.16), this represents 29 expected warnings per site. Data are gathered (the observed frequencies) to see if they match the expected frequencies. The null hypothesis is that there will be no difference between the observed and expected frequencies. Following our earlier advice, we set the level of significance in advance – in this case, let us say at p = .05. If any significant difference is found, then the null hypothesis will be rejected.
Table 24.16 presents the data in what is called a contingency table.
Table 24.16
The appropriate test here is the chi-square test. For each case we deduct the expected frequency from the observed frequency, square the result and divide by the expected frequency; the chi-square statistic is the sum of the totals (see Table 24.17). Is the chi-square statistic of 71.86 significant? To find out, we look up the figure in an appropriate statistical table for the chi-square distribution. The value to use will be in the column for p = .05 and for 3 degrees of freedom (the number of categories minus one). This figure turns out to be 7.81, which is far exceeded by our chi-square figure. Hence, we can say that the difference is significant, and we can reject the null hypothesis that there is no difference in the issue of written warnings between the sites.
Table 24.17
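The chi-square calculation described above is simple to reproduce. A pure-Python sketch, using invented observed counts that total 116 (the book’s actual site-by-site figures appear in Table 24.17 and are not reproduced here):

```python
def chi_square(observed, expected):
    """Goodness-of-fit statistic: the sum of (O - E)^2 / E over all
    categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical warning counts for sites A-D, summing to 116, so the
# null hypothesis of an even spread expects 29 warnings per site.
observed = [12, 64, 18, 22]
expected = [29, 29, 29, 29]

statistic = chi_square(observed, expected)
critical = 7.81   # chi-square critical value, p = .05, 3 degrees of freedom
print(statistic > critical)  # True -> reject the null hypothesis
```

With real data the same comparison against 7.81 (or the p-value SPSS reports) gives the accept/reject decision directly.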
Note, however, that the expected frequencies do not have to be equal. Say we know through some prior research that site B is three times as likely to issue warnings as the other sites. Table 24.18 presents the new data.
Table 24.18
Here we find that the new chi-square statistic is only 6.34, which is not significant. Diamantopoulos and Schlegelmilch (1997) warn that when the number of categories in the variable is greater than two, the chi-square test should not be used where:
· More than 20 per cent of the expected frequencies are smaller than five.
· Any expected frequency is less than one.
If the numbers within cells are small, and it is possible to combine adjacent categories, then it is advisable to do so. For example, if some of our expected frequencies in Table 24.18 were rather small but sites A and B were in England and sites C and D in Germany, we might sensibly combine A with B and C with D in order to make an international comparison study.
Nominal groups and quantifiable data (normally distributed)
Let us say that you want to compare the performance of one group over a period of time (or of two matched groups) using quantifiable variables such as scores. In these circumstances we can use a paired t-test. If we have two different samples of people for which we wish to compare scores, then we would use an independent t-test. It is assumed in t-tests that the data are normally distributed, and that the two groups have the same variance (the standard deviation squared). If the data are not normally distributed, then usually a non-parametric test, the Wilcoxon signed-rank test, can be used – although, as we shall see, t-tests can be used even when the distribution is not perfectly normal. The t-test compares the means of the two groups to see if any differences between them are statistically significant. If the p-value associated with t is low (< .05), then there is evidence to reject the null hypothesis and accept the alternative hypothesis: that is, the means of the two groups are statistically different.
Say that we want to examine the effectiveness of a workplace stress counselling programme. Taking a simple before-and-after design (recall Chapter 6 for some of the limitations of this design), we get respondents to complete a stress assessment questionnaire before the counselling and then after it. We can see from the data set provided (see the book’s website and the link to Data sets: t-test data) that in a number of cases the levels of stress have actually increased! But in most cases stress levels have fallen, in some cases quite sharply. Worked Example 24.1 shows how we can use SPSS to see if this is statistically significant.
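Before turning to SPSS, the underlying calculation can be sketched in pure Python. The before/after stress scores below are invented, not the book’s dataset:

```python
from statistics import mean, stdev
from math import sqrt

def paired_t(before, after):
    """Paired t statistic: the mean of the per-person differences
    divided by the standard error of those differences."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n))

# Hypothetical stress scores for eight employees, before and after
# counselling (higher = more stressed).
before = [12, 15, 11, 14, 13, 16, 12, 15]
after = [10, 13, 12, 11, 12, 13, 10, 12]
t = paired_t(before, after)
print(round(t, 2))  # about 3.91, above the two-tailed critical value
                    # of 2.365 for 7 degrees of freedom at p = .05
```

SPSS performs exactly this calculation in the Paired Samples T-test procedure and converts the statistic into the Sig. (2-tailed) probability.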
Worked Example 24.1
Type the gain scores for both the experimental and control groups into an SPSS data file. Before we begin any data analysis, we need to determine the normality of the data distribution, since this will influence whether we should use parametric or non-parametric statistical tests. Remember that parametric tests are the more powerful, but can only be used if the data are relatively normally distributed.
For this worked example, use the t-test dataset (available as part of the online resources). Save the data and open it in SPSS.
1. Click on [Analyze], then on [Descriptive statistics] followed by [Explore].
2. Click on Experimental A and Experimental B and move them into the [Dependent List] box by clicking on the arrow.
3. In the [Display] section make sure that [Both] is ticked.
4. Click on [Statistics] and then on [Descriptives] and [Outliers]. Click on [Continue].
5. Click on the [Plots] button. Then under [Descriptive] click on [Histogram]. Select [Normality plots with tests] and [Continue].
6. Click on the [Options] button and in the [Missing values section] select [Exclude cases pairwise]. To complete the process click on [Continue] followed by [OK].
7. You should then see the data as presented in the outputs below.
(The SPSS outputs appear here. Footnotes to the tables of upper and lower extremes note that only partial lists of cases with the extreme values are shown, and that the Lilliefors significance correction is applied to the tests of normality.)
In the Descriptives output, note the statistic for the 5% Trimmed Mean. SPSS removes the top and bottom 5 per cent of cases and recalculates this new mean, to see if extreme scores (outliers) have much impact. In our example above, the mean and trimmed mean are very similar, so we should not be concerned about outliers distorting the results. The output also provides values for skewness and kurtosis. Skewness provides an indication of the symmetry of the distribution and (as discussed above) can be reported as positive (if scores are clustered to the left) or negative (if clustered to the right). Kurtosis refers to the peakedness or otherwise of the distribution. Values of less than 0 indicate a relatively flat distribution, that is, too many cases at the extremes (as in the Experimental A example).
Watch: Testing for normality
The table labelled Tests of Normality contains the Shapiro–Wilk statistic, which is generally used for samples ranging from 3 to 2,000. Above 2,000, the Kolmogorov–Smirnov statistic is generally used to test for the normality of the distribution. A result where the Sig. value is more than 0.05 indicates normality, while a result of less than 0.05 violates the assumption of normality. Given that the sample size in this study is below 100, we will use the Shapiro–Wilk statistic. In the above table we can see that the Sig. value for Experimental A is above 0.05, indicating normality, whereas that for Experimental B is below 0.05, violating the assumption of normality. Does this mean that we must use a non-parametric test? Not necessarily. For sample sizes over 30, Pallant (2013) suggests that violation of the normality assumption should not lead the researcher to panic; the use of parametric tests remains permissible. The next step is to look at the results for skewness and kurtosis in the Descriptives table. As long as these fall between -1.0 and +1.0, we can assume that the distribution is sufficiently normal for the use of parametric tests.
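For readers who like to cross-check SPSS output programmatically, the same normality checks can be sketched in Python with SciPy. This is an illustration only, not part of the book's materials: the gain scores below are hypothetical stand-ins for the t-test dataset, and the `scipy.stats` calls replace the SPSS Explore procedure.

```python
# Sketch: normality checks equivalent to SPSS Explore, on hypothetical data
from scipy import stats

gains = [10, 12, 9, 11, 13, 8, 10, 12, 11, 9, 14, 10, 7, 11, 12]

# Shapiro-Wilk: Sig. (p) > 0.05 is consistent with normality
w_stat, p_value = stats.shapiro(gains)

# Skewness and kurtosis: values between -1.0 and +1.0 are usually taken
# as near enough to normal for parametric tests
skew = stats.skew(gains)
kurt = stats.kurtosis(gains)  # Fisher's definition: 0.0 for a normal curve

# 5% trimmed mean, as reported in the SPSS Descriptives output
trimmed = stats.trim_mean(gains, proportiontocut=0.05)

print(f"Shapiro-Wilk p = {p_value:.3f}, skew = {skew:.2f}, kurtosis = {kurt:.2f}")
print(f"Mean = {sum(gains) / len(gains):.2f}, 5% trimmed mean = {trimmed:.2f}")
```

As in SPSS, the decision rule is the same: if p exceeds 0.05 and skewness and kurtosis sit within ±1.0, parametric tests are defensible.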
Hence, we apply the procedure for a paired sample t-test as follows:
1. Click on [Analyze] then on [Compare Means] and then on [Paired Samples T-test].
2. Click on the variables Experimental A and Experimental B and on the arrow to move them into the [Paired Variables] box.
3. Click on [OK]. You should see the output as presented below.
The procedure for interpreting these results is as follows.
1. Look at the Paired Samples Test table, at the right-hand column labelled Sig. (2-tailed), which gives the probability value. If this is less than 0.05, then we can assume that the difference between the two scores is significant. In our case Sig. = 0.00, so the difference in the stress scores is, indeed, significant.
2. Establish which set of scores is the higher (Experimental A or Experimental B). The box Paired Samples Statistics gives the mean for each set of scores. The mean for Experimental A was 10.3736 while that for Experimental B was 8.4176. We can therefore conclude that the workplace counselling programme did, indeed, help to reduce stress.
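The same paired-samples test can be sketched in Python with SciPy. The before/after stress scores below are hypothetical, not the book's dataset; they simply mirror the Experimental A and Experimental B structure, where each pair of scores comes from the same respondent.

```python
# Sketch: paired-samples t-test on hypothetical pre/post stress scores
from scipy import stats

before = [11, 10, 12, 9, 13, 10, 11, 12, 10, 9]  # pre-intervention
after = [9, 8, 10, 8, 11, 9, 9, 10, 8, 8]        # post-intervention

# ttest_rel pairs the scores case by case, as SPSS does
t_stat, p_value = stats.ttest_rel(before, after)

if p_value < 0.05:
    # As in SPSS, compare the two means to see which direction the change ran
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    print(f"Significant: t = {t_stat:.2f}, p = {p_value:.4f}; "
          f"mean fell from {mean_before:.2f} to {mean_after:.2f}")
```

Note that `ttest_rel` requires the two lists to be the same length and in the same case order; shuffling one list would silently break the pairing.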
Now a note of caution. Although we obtained differences in the two sets of scores (and the Sig. result suggests that this did not occur by chance alone), we must be careful when it comes to attributing causation. We also need to take into account other factors that could explain the fall in stress levels – refer to ‘Design 3: One group, pre-test/post-test’ in Chapter 6. The researcher should try to anticipate the kinds of contaminating factors that could confound the results. One approach would be to improve the research design – for example, by introducing a control group that does not receive the intervention (in this case the stress counselling).
Read: Normality test
Nominal groups and quantifiable data (not normally distributed)
In the section above we looked at differences in normally (or near normally) distributed data. But what if the data do not satisfy the assumptions required for statistical tests based on a normal distribution? Let us say that we are exploring the attitudes of men and women towards the purchase of skin care products. Do women prefer these types of product more than men?
Figure 24.14 provides an example of part of a survey dealing with attitudes towards personal grooming. The resulting data from this imaginary survey are provided on the book’s website (see Data sets: Mann-Whitney U data).
The data are captured in an SPSS file, with each questionnaire being allocated its own Id number. Male respondents are allocated the code 1 and females the code 2. Each respondent is allotted a score by adding their responses. Note that in Figure 24.14, question 3 has been posed in a negative form to encourage respondents to think more carefully about their answers. This item therefore needs to be reverse-scored, so the response shown is allocated a score of 1. Hence, the total score for this respondent would be coded as 6. Total scores for each respondent range from 4 to 20.
Figure 24.14 Example of a portion of a survey on skin care products
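Reverse-scoring negatively worded items before summing is easy to get wrong by hand, so here is a minimal sketch of the idea in Python. The function, the four-item survey, and the 5-point scale are hypothetical illustrations, not the book's coding scheme.

```python
# Sketch: summing Likert responses with one reverse-scored item
SCALE_MAX = 5  # hypothetical 5-point Likert scale

def total_score(responses, reverse_items=(2,)):
    """Sum item responses, reverse-coding negatively worded items.

    reverse_items holds zero-based indices; index 2 stands in for a
    negatively worded question 3.
    """
    total = 0
    for i, r in enumerate(responses):
        if i in reverse_items:
            r = (SCALE_MAX + 1) - r  # 5 becomes 1, 4 becomes 2, and so on
        total += r
    return total

# A respondent who answered 1, 2, 5, 3: the "5" on the reversed item
# counts as 1, so the total is 1 + 2 + 1 + 3 = 7
print(total_score([1, 2, 5, 3]))  # → 7
```

The `(SCALE_MAX + 1) - r` formula is the standard reversal for a 1-to-k scale; it maps each response onto its mirror image without changing the range of possible totals.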
Practise: Mann-Whitney U tests
Watch: Doing a Mann-Whitney U test
Worked Example 24.2
For this worked example, use the Mann-Whitney dataset (available as part of the online resources). Save the data and open it in SPSS.
First of all, we test for whether the data are normally distributed (see Worked Example 24.1 for how to test for this). Note that as we have both a dependent variable (attitude) and an independent variable (sex), you can generate data for both male and female groups by moving the categorical variable (sex) into the [Factor List] box in the [Explore] dialogue box.
Looking at the Kolmogorov–Smirnov statistic in the Tests for Normality table below, we note that the figure for Sig. is 0.00, indicating that the assumption of normality has been violated. Rather than an independent t-test, we now need to make use of its non-parametric alternative, the Mann-Whitney U test.
Lilliefors Significance Correction
The procedure for the Mann–Whitney U test is as follows:
1. Click on [Analyze], then on [Nonparametric Tests], followed by [2 Independent Samples].
2. Click on the dependent variable [Attitudes] and the arrow to move it into the [Test Variable List] box.
3. Click on the categorical (independent) variable [sex] and the arrow to move this into the [Group Variable] box.
4. Click on the [Define Groups] button. In the [Group 1] box input the number ‘1’, and in the [Group 2] box, input ‘2’ to match sex Id numbers in the data set. Click on [Continue].
5. Click on the [Mann-Whitney U] box under the label [Test Type].
6. Click on [Options] and then [Descriptive]. Then click on [Continue] and finally, [OK].
You should see the output as presented below.
Grouping Variable: Sex
To analyse the data, look at the Test Statistics box for the value of Z and the significance level. The Z value has a significance level of 0.000. Given that this figure is lower than the probability value of 0.05, we can say that this result is significant. Since the result is significant we now need to make reference to the [Ranks] box and particularly the differences between the mean ranks, commenting on which is higher (in our example, it is females).
Note that the Mann-Whitney U test is also useful in other situations. Say, for example, we employ two different training programmes that teach the same topic and want to see which is the most effective. If it cannot be assumed that the data come from a normal distribution, we would use the Mann–Whitney U test to compare the test scores of the two sets of learners.
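The Mann-Whitney U procedure can also be sketched outside SPSS with SciPy. The attitude scores for the two groups below are hypothetical, not the book's dataset; they simply illustrate two independent samples whose distributions cannot be assumed normal.

```python
# Sketch: Mann-Whitney U test on hypothetical attitude scores by group
from scipy import stats

male_scores = [7, 8, 9, 9, 10, 10, 11, 12]
female_scores = [12, 13, 14, 14, 15, 16, 16, 17]

# Two-sided test, matching the SPSS default output
u_stat, p_value = stats.mannwhitneyu(male_scores, female_scores,
                                     alternative="two-sided")

if p_value < 0.05:
    print(f"Significant difference in ranks: U = {u_stat:.1f}, "
          f"p = {p_value:.4f}")
```

As with the SPSS output, a significant result is followed by inspecting which group's ranks (here, scores) sit higher; the test itself only says the two distributions differ.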
Statistical analysis: associations between variables
This section examines situations where the study contains two independent variables of the same type (nominal, ordinal, interval/ratio).
Table 24.19 illustrates the different kinds of measurement of association between two variables, depending on the type of variable involved.
Table 24.19
Associations between two nominal variables
Sometimes we may want to investigate relationships between two nominal variables – for example:
· Educational attainment and choice of career.
· Type of recruit (graduate/non-graduate) and level of responsibility in an organization.
You will recall from the discussion of chi-square, above, that we used the statistic to see whether the distribution of a variable occurred by chance or not. Chi-square is appropriate when you have two or more variables, each of which contains two or more categories.
Let us say that a research team is studying a coaching programme and that a set of interviews with coachees (the recipients of coaching) has indicated that, when it came to a choice of coach, many (both males and females) expressed positive preferences for female coaches. Given that these comments were made by several respondents, the researchers turned to the quantitative data to see whether this was true.
Table 24.20 illustrates the observed values, that is, the dataset showing the gender of coach selected by both female and male coachees. We can see that both male and female coachees did, indeed, choose more female than male coaches. But is this difference significant? To find out, we need to use the chi-square statistic. Worked Example 24.3 shows how SPSS can be used for this data analysis.
Table 24.20
Practise: Chi square test
Worked Example 24.3
For this worked example, use the Chi square dataset (available as part of the online resources). Save the data and open it in SPSS.
1. Click on [Analyze] and then on [Descriptive Statistics] followed by [Crosstabs].
2. Click on one of the variables, for example [GenderCoachee], and then click on the arrow to move this variable to the [Rows] box. Then click on [GenderCoach] and then the arrow to move this to the [Columns] box.
3. Click on the [Statistics] button, followed by [Chi-square] and [Phi and Cramer’s V]. Then click on [Continue].
4. Click on the [Cells] button, then click on [Observed] in the [Counts] box. In the [Percentages] box, click on [Row], [Column] and [Total].
5. Click on [Continue], followed by [OK]. You should see the output as illustrated below.
In analysing the above output, the first step is to ensure that one of the assumptions of the chi-square test has not been violated: that is, that the expected cell frequency should never be less than five. We can see from the footnote (b) under the Chi-Square Tests table that 0 per cent of cells have an expected count of less than five – so we have not violated the assumption. In the study we are discussing, the minimum expected count is, in fact, 33.08.
a. Computed only for a 2 × 2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 33.08
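The chi-square test of association, including the expected-count check, can be sketched in Python with SciPy. The 2 × 2 crosstab below is hypothetical, not the coaching study's data: rows are coachee gender, columns are the gender of coach chosen.

```python
# Sketch: chi-square test of association on a hypothetical 2x2 crosstab
from scipy import stats

observed = [[20, 45],   # male coachees:   chose male coach, female coach
            [15, 60]]   # female coachees: chose male coach, female coach

chi2, p_value, dof, expected = stats.chi2_contingency(observed)

# Check the test's assumption before interpreting it:
# no expected cell frequency should fall below five
assert expected.min() >= 5

print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
```

Note that `chi2_contingency` computes the expected counts for you, so the assumption check mirrors SPSS's footnote (b); by default it also applies Yates's continuity correction for 2 × 2 tables, which is why its chi-square value can differ slightly from an uncorrected hand calculation.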