QUANT homework - Information Systems
Homework – Week 5
Please submit your assignment to Moodle in PDF format only. Tables, charts, and graphs should be formatted as expected for the class. Font size of 11 pt or higher. DO NOT SUBMIT YOUR EXCEL DATASET. Copy and paste the numbered prompts below into your submission document and record your responses, as applicable, directly beneath. Include the following in your submission:
Simple Regression Pre-analysis (include ALL charts, graphs and tables you generate formatted to class expectations). Assume all variables (with the exception of the subject variable) are ratio level.
1. YOU MUST HAVE the Data Analysis Tool Pack add-in installed to complete this assignment. Please review the notes for the week regarding how to use the Regression tool within the Data Analysis Tool Pack.
2. Open the dataset associated with this assignment. Use New Capital Expense as the explanatory variable and End of Year Inventory as your response variable.
3. Generate a single Descriptive summary table in Excel of the explanatory and response variables. That is, simultaneously enter both variables into the Descriptive Statistics tool. Discuss the mean, median, standard deviation, range, kurtosis, and skewness for EACH variable. Please DO NOT JUST LIST THE numerical summaries. Discussion involves interpretation and should result in you writing complete sentences. See the exemplar, posted to Moodle, for a sample writeup.
4. Use Excel to calculate and report the correlation number for these two variables. Interpret the result of the correlation number by discussing the linear strength and direction of the relation between the two variables.
5. Use Excel to generate and report a scatter plot for the 2 variables. The Explanatory variable should be on the horizontal axis. Interpret the form and direction of the plot. Format plot to class expectations.
6. Using only the visual inspection of the scatter plot, is a transformation warranted? Please review course notes and use what you notice about the dot pattern within the scatter plot as evidence to support your answer.
7. Use the Data Analysis Tool pack to run the regression analysis to generate a residual plot for these variables. DO NOT INTERPRET NOR REPORT THE NUMERICAL SUMMARIES generated in the regression tables. Copy and Paste ONLY the residual plot into your report. Using only the visual inspection of the residual plot, is a transformation warranted? Review course notes and use what you notice about the dot pattern within the residual plot to support your answer.
8. Check the 4 simple regression assumptions. Reference class notes so that you are clear what the checks should be and how to perform them in Excel. Create 4 subheadings to discuss EACH of the 4 assumptions indicating if the assumptions were met or were violated. Under each subheading, also include any tables or charts you’ve generated as evidence for your conclusions. Format all charts and tables to class expectations (see exemplar). Your claims regarding the assumptions must be based on content contained within the tables, numerical summaries, or charts that you generated. Please reference them by name in your write-up, for instance, “It can be seen in Figure 1 ...”.
9. If any of the simple regression assumptions were violated, attempt to fix them using any combination of the 8 basic mathematical transformations (applied to either the explanatory or response variable or both) as presented in the notes for this week. Discuss in detail which transformation approach you decided to take and why. DO NOT SUBMIT the transformed dataset for grading. But, please report all charts or tables generated in Excel that you used to support your transformation efforts.
Exemplar: Homework 1 Quant 530 1
Homework 1 : Exemplar
1. Generate descriptive statistics tables for each variable. Discuss measures of
center, variability, skewness, kurtosis, and maximum and minimum values for
each variable individually (5pts). Discuss a comparison of the mean and
variability for both variables. (5 pts)
Table 1
Descriptive statistic tables
                      Approach 1    Approach 2
Mean                  14.4          12.3
Standard Error        1.377259      1.452946
Median                14            12
Mode                  12            10
Standard Deviation    6.159289      6.497773
Sample Variance       37.93684      42.22105
Kurtosis              0.110085      -0.74375
Skewness              0.337807      -0.08879
Range                 24            24
Minimum               4             0
Maximum               28            24
Sum                   288           246
Count                 20            20
Measures of center
Measures of center provide an indication of the most typical number of
subscriptions sold when either sales approach is used. It can be seen from the Descriptive
Statistics tables that for both approach 1 and approach 2, the mean and median are close in
value. This suggests that the distributions of subscription sales under both approaches are near
bell shaped, with most subscription sales remaining close to the typical value regardless of the
approach used. Even so, the tables suggest that when approach 1 is applied, on average a
larger number of subscriptions will be sold than when approach 2 is applied.
Measures of variability
Given the near bell shape of both distributions, the Empirical Rule can be applied to provide a
range estimate for the number of subscriptions sold. That is, for approach 1 and approach 2,
approximately 68% of the numbers of subscriptions sold are between (8.2, 20.6) and (5.8, 18.8)
respectively. Agents applying approach 1 are more likely, and at a more consistent rate, to
sell a higher number of subscriptions than those who applied approach 2. Furthermore, it can
be seen from the Descriptive Statistics table that there have been agents who have not been
successful selling any subscriptions while applying approach 2. This has not occurred, as of yet,
when approach 1 has been applied. In fact, the highest selling agent applied approach 1 when
accomplishing this goal.
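The Empirical Rule intervals quoted above can be reproduced from the mean and standard deviation in Table 1. The following is a minimal sketch in Python rather than Excel; the function name `one_sd_interval` is made up for illustration.

```python
# Mean and standard deviation copied from Table 1 of the exemplar.
summaries = {
    "Approach 1": {"mean": 14.4, "sd": 6.159289},
    "Approach 2": {"mean": 12.3, "sd": 6.497773},
}


def one_sd_interval(mean, sd):
    """Return (mean - sd, mean + sd), the range the Empirical Rule says
    holds roughly 68% of values in a near bell shaped distribution."""
    return (mean - sd, mean + sd)


intervals = {
    name: one_sd_interval(s["mean"], s["sd"]) for name, s in summaries.items()
}

for name, (low, high) in intervals.items():
    print(f"{name}: about 68% of agents sold between {low:.1f} and {high:.1f}")
```

The rule only applies when the distribution is close to bell shaped, which is why the write-up checks the mean against the median first.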
Kurtosis and Skewness
The reported skewness numbers indicate that distributions of subscription sales under both
approaches are fairly symmetric in shape suggesting a balance in the number of agents who sell
subscriptions over and below the expected number. In fact, the reported kurtosis values
suggest that it is not very likely that large percentages of agents will sell extremely poorly or
well, beyond what is considered typical, regardless of which approach is used.
2. Generate a histogram for each variable. Discuss the shape of the histogram.
(5pts)
Figure 1: Number of subscriptions sold using Approach 1
[Figure 1 histogram: x-axis Sale Categories (5 to 30), y-axis Subscriptions sold (0 to 9)]
Figure 2: Number of subscriptions sold using Approach 2
Shape of histograms
Although the reported mean and median numerical summaries suggested that both approaches
resulted in near bell shaped subscription sales distributions, a shape closer to bi-modal
can be seen in Figure 2 for subscriptions sold under approach 2. This
suggests that the reported mean value of approximately 12 subscriptions sold for approach 2
may not be the best representation of the number of subscriptions that are likely to be sold by
agents using this approach. In fact, it might be worth examining under which conditions
agents are more likely to sell the high of 20 as opposed to the high of 10 subscriptions under this
method. If these conditions are easily replicated, then it is possible that approach 2 may be
considered as viable (or even more so) an approach to the sale of subscriptions as approach 1 is
suggested to be.
3. Calculate the correlation coefficient between the two variables. Interpret the
results (5 pts)
A correlation of approximately .7 suggests a moderately positive relation between the
subscription sales performance of an agent applying approach 2 and their performance when
applying approach 1. That is, in general we can expect that if an agent does well selling
subscriptions using approach 1, then there is a good chance that they will also do well when
using approach 2. It’s possible that this may mean that it is not necessarily the talent of the
sales agent that is generating the sales differences observed but the actual use of the particular
approach. More investigation is needed.
[Figure 2 histogram: x-axis Sales Categories (5 to 25), y-axis Subscriptions sold (0 to 7)]
4. Select one of the variables to be explanatory and the other to be response.
Indicate your selection. Generate a scatter plot of the two variables. Interpret
the direction, form, and strength observed through examination of the data
pattern (5 pts)
Figure 3: Scatterplot of the relation between the number of subscriptions
sold under approach 1 and approach 2
Figure 3 shows the dot pattern resulting when an agent’s sales performance using approach 1
(explanatory) is being evaluated for impact on their performance when using approach 2
(response). It appears from Figure 3 that a positive upward trending dot pattern is most
evident for low and high selling agents. That is, if agents sell subscriptions well or badly under
approach 1 they appear to be likely to also sell well or badly under approach 2. However, that
same positive linear trend doesn’t appear to be as evident for those agents who tend to sell
near the mid-range of subscription levels around the reported mean value of 14.4 for approach
1. The dot pattern in that area appears more scattered and less systematically predictable.
[Figure 3 scatter plot: x-axis Approach 1 Sales (0 to 30), y-axis Approach 2 sales (0 to 30)]
5. On the scatter plot in Excel, generate the regression line and the regression
equation. Re-Write the regression equation using the variable names. Write an
interpretation of the regression equation using the variable names and the
regression coefficient values. (5 pts)
Figure 4: Scatterplot of the relation between the number of subscriptions
sold under approach 1 and approach 2 with regression equation included
The regression equation, (Approach 2 Sales) = .7(Approach 1 sales) + 2.3, suggests that for the typical
agent the change in the rate of sales made under approach 1 is lower than under approach 2. In
other words, this suggests that more effort may be needed to generate a comparable number of
subscription sales using approach 1 than when using approach 2. Moreover, the equation suggests
that even the worst agent, who doesn’t successfully sell any subscriptions under approach 1, is likely on
average to sell approximately two subscriptions using approach 2.
[Figure 4 scatter plot with regression line y = 0.6959x + 2.2791: x-axis Approach 1 Sales (0 to 30), y-axis Approach 2 sales (0 to 30)]
Simple Regression
Learning Objectives
• Conduct a descriptive statistics investigation prior to running a simple
regression analysis
• Check simple regression usage assumptions. Address violations to
assumptions
• Run a simple regression analysis. Interpret simple regression table
output
• Generate resulting simple regression equation from regression table
output
Simple Regression
• Examines relation between 2 variables (explanatory and response
variable)
• Begin analysis with a descriptive statistic analysis (numerical and
graphical) of each variable
• Ensure the simple regression usage assumptions are met prior to
running analysis in Excel
• It is possible that a transformation to one or both variables may need
to be applied prior to running the analysis
Simple Regression
• Examines relation between 2 variables (explanatory(x) and response
variable(y))
• Explanatory variable is examined to determine how well it serves as a
predictor for values of the response variable
• Used to quantify the strength of the linear relation between the two
variables
• The sample regression equation, generated from the analysis, is
dependent on the sample used and is an estimate of the true
population regression equation
The equation summarizes the overall dot pattern observed in a bivariate scatter plot of the two variables
Sample Simple Regression Line: ŷ = b0 + b1 x
b0 is the intercept (B0 is sometimes used)
b1 is the slope (B1 is sometimes used)
Sample Simple Linear Regression Coefficients
Intercept b0
• It is the value where the regression line crosses the y-axis on the graph
• Value of the sample regression equation when the explanatory
variable x = 0
• It is the expected mean value of the response variable when x=0
• If in practice the explanatory variable never has a value of zero, then
the intercept is not interpreted
• If x=0 is outside of the range of observed data used in the sample, do not
interpret the intercept
Example interpretation of the regression intercept (units are in $1,000s)
Sales dollars per month = 3.7 (number of employees) + 203.5
Interpretation : If there are no employees, the equation predicts that sales
for the month will be on average $203,500.
Slope b1
• The regression slope coefficient is the mean amount of change in the
response variable that the equation predicts as the explanatory
variable is increased by 1 unit
• The sign of the regression slope coefficient indicates the direction
(positive or negative) of the linear relation between the 2 variables
• If the regression slope coefficient is very near 0, then as one variable
increases, the other remains fairly constant. Therefore, there is no
meaningful predictive relationship present
Example interpretation of the regression slope (units are in $1,000s)
Sales dollars per month = 3.7 (number of employees) + 203.5
Interpretation : If the number of employees increases by 1, the
equation predicts the sales dollars per month on average will increase
by approximately $3700.
• Results can be read in the simple regression Excel output table
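The intercept and slope interpretations above can be checked by plugging values into the example equation. This is a small illustrative sketch; the function name `predict_sales` is hypothetical, and the coefficients are the ones from the slide example (units of $1,000).

```python
def predict_sales(employees, slope=3.7, intercept=203.5):
    """Predicted sales dollars per month, in $1,000s, for a given
    number of employees, using the example regression equation."""
    return slope * employees + intercept


# Intercept: the prediction when the explanatory variable is 0.
at_zero = predict_sales(0)

# Slope: the change in the prediction when employees increase by 1.
change_per_employee = predict_sales(11) - predict_sales(10)

print(f"Predicted sales with 0 employees: ${at_zero * 1000:,.0f}")
print(f"Predicted change per added employee: ${change_per_employee * 1000:,.0f}")
```

Whether the intercept is meaningful still depends on x = 0 being inside the range of observed data, as noted on the intercept slide.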
Simple Regression Usage Assumptions
• The relation between the explanatory and response variables is linear
• Check the scatter plot. Should have a linear dot pattern
• The response variable does not have a relation to the residuals (it is independent
of the errors in prediction)
• Check a scatterplot of residuals versus predicted values (residual plot). Should have a
scattered dot pattern. Check the correlation between these values. Correlation
should be very near zero
• Residuals are normally distributed
• Check a histogram of the residuals. Check skewness numbers. Check normal
probability plots
• Variance of the residuals should be statistically the same across all values
of the explanatory variable
• Check a scatterplot of residuals versus predicted values (residual plot). Should have a
scattered dot pattern where the variance across values is consistent
Residuals
The estimates for the response variable made through the regression line are called the predicted values
• To find residuals, subtract the predicted value (calculated with the regression equation) from the
observed value
• negative residual: the regression equation is overestimating
• positive residual: the regression equation is underestimating
• Best fitting line means the sum of its squared residuals is smallest (least squares)
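The residual definitions above can be sketched in a few lines of Python. The data are the coded year and salary values from Example2 in these notes; the helper name `fit_line` is made up for illustration, and the course itself uses the Excel Regression tool for this step.

```python
# Coded year (explanatory) and salary in $10,000 (response) from Example2.
year = [80, 82, 90, 90, 91, 96, 97, 98, 99, 99, 99, 101, 101, 104, 105]
salary = [1, 2, 3, 4.5, 5.2, 8.4, 11, 12.4, 12.5, 13.1, 15, 17, 22.2, 22.6, 25.8]


def fit_line(x, y):
    """Least-squares slope and intercept for a simple regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(year, salary)
predicted = [intercept + slope * xi for xi in year]

# Residual = observed value minus predicted value.
residuals = [yi - pi for yi, pi in zip(salary, predicted)]
sum_squared_residuals = sum(r ** 2 for r in residuals)

for xi, yi, ri in zip(year, salary, residuals):
    label = "underestimating" if ri > 0 else "overestimating"
    print(f"year={xi} salary={yi} residual={ri:+.2f} (equation is {label})")
```

Plotting `residuals` against `predicted` gives the residual plot used in the assumption checks.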
Scatter plot check for linearity
Residual Plot check
[Example residual plots: one where assumptions are met (scattered dot pattern and equal variance) and two where assumptions are not met]
Normality Plot check
[Example normal probability plots: Normal, Skewed Left, Skewed Right, Thick tailed, Thin tailed]
Assumptions are met if the residual dot pattern closely follows the line
Histogram Normality check
[Example histograms: Near Normal, Right Skewed, Left Skewed]
Options to Fix linearity problems
• Increase the sample size
• Check for outliers and remove them if warranted or assign new
variables through imputation (beyond the scope of the course)
• Apply a transformation to the entire data set for one or both variables
• Trying transformations may be an iterative process of ‘try and check’ in order
to meet assumptions. Try fixing the linearity challenge first and then proceed
on to the other fixes as needed
• As transformations are applied, be sure to re-check all assumptions to ensure
that they all hold after application of the transformation
Applying a transformation
Often trial and error. No one technique works all the time
• Transform the explanatory (x) values only.
• Try if linearity is the only assumption violation
• Transform the response (y) values only.
• Try when non-normality and/or unequal variances are the assumption
violations
• Transform both sets of values.
• Try when linearity, non-normality , and unequal variances are the
assumption violations
Frequently used Transformations
• In simple regression, try looking at the scatter plot dot pattern for
hints on which transformation to try
[Example dot patterns: for one pattern, try an LN transformation; for the other, try a reciprocal (1/x) or exponential e^(-x) transformation]
Other Frequently used Transformations
• Power transformations transform the response variable to some
power (usually between -1 and 2)
• Try if the residual variances are unequal and/or residuals are not normal
• A square root or reciprocal transformation can be applied to the
response variable
• Try if the residual variances are unequal
Which tried transformation should you use?
• Run the simple regression analysis, in Excel, to generate the
regression output table for each transformation
• Examine the adjusted R squared value for each transformation and
for the original data
• Select the data set to use with the largest adjusted r squared value (or
the smallest standard error value)
What to check in the regression output table
• The overall significance of the regression equation (in the ANOVA table)
• A significance F value less than .05 suggests the model is a good fit to the data
• The overall significance of the regression coefficients (in the coefficient table)
• A p-value less than .05 suggests the explanatory variable is a good predictor of the response variable
Approach we will use in this class
• If regression equation is not significant, the model is not a good fit to the data
• If the regression model is significant but the coefficient is not, the model provides improved fit over using the expected
value of the response variable as the estimated prediction
• If the regression model is significant and the coefficient is significant, the model is a good fit for the data and the
explanatory variable is contributing significantly towards the quality of prediction
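The three-way conclusion above can be written as a small decision function. This is only a sketch of the class approach; the function name and the returned wording are illustrative, and treating a value of exactly .05 as not significant is an assumption, since the notes only say "less than .05".

```python
def regression_conclusion(significance_f, coefficient_p, alpha=0.05):
    """Return the class conclusion for a simple regression.

    significance_f: Significance F from the ANOVA table.
    coefficient_p:  p-value of the explanatory variable in the coefficient table.
    alpha:          cutoff; values below it count as significant (assumed strict).
    """
    if significance_f >= alpha:
        return "model is not a good fit to the data"
    if coefficient_p >= alpha:
        return "model improves on the expected value, coefficient not significant"
    return "model is a good fit and the explanatory variable contributes significantly"


# Values read from the salary regression output table in these notes.
conclusion = regression_conclusion(0.0000017547, 0.0000017547)
print(conclusion)
```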
What to check in the regression output table
Interpretation: Results suggest that because the p-
value .09>.05, the presence of East variable in the
regression equation is not contributing significantly to
the prediction of the response variable.
Interpretation: Results suggest that because the
significance F value .0003<.05, the regression equation
is a good fit to the data and provides meaningful
prediction of the response variables given the presence
of the explanatory variables in the model
Other numerical summaries in the regression
output table to check
• Correlation Coefficient: Strength of Linear Relation
• Coefficient of Determination (r^2) - Use Adjusted R^2
• fraction of the variation in the data accounted for by the regression equation . Values are
between 0 and 1 in the table but are sometimes reported as percentages
• SEE (Standard Error of the Estimate)
• standard deviation of the residuals
• how spread out the observations are from the regression line
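These summaries are all derived from the sums of squares in the ANOVA table. As a rough check, the sketch below recomputes them from the Regression and Residual SS values reported for the salary example in these notes (15 observations, 1 explanatory variable); the variable names are chosen here for illustration.

```python
import math

# Sums of squares from the ANOVA table of the salary regression output.
ss_regression = 719.7849352684
ss_residual = 139.892398065
n = 15  # observations
k = 1   # explanatory variables in a simple regression

ss_total = ss_regression + ss_residual

# Coefficient of determination: fraction of variation explained.
r_squared = ss_regression / ss_total

# Adjusted R squared penalizes for the number of explanatory variables.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Standard error of the estimate: standard deviation of the residuals.
see = math.sqrt(ss_residual / (n - k - 1))

# Multiple R is the absolute value of the correlation in simple regression.
multiple_r = math.sqrt(r_squared)

print(f"R Square          {r_squared:.4f}")
print(f"Adjusted R Square {adjusted_r_squared:.4f}")
print(f"Standard Error    {see:.4f}")
print(f"Multiple R        {multiple_r:.4f}")
```

The results should agree with the Regression Statistics block that Excel prints above the ANOVA table.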
Writing the Simple Regression Equation
Report the regression equation ONLY if all simple regression
assumptions are met
Use the values in the coefficients column
Response Variable = −1.76 (Price of Roses) + 183475.43
Simple Regression Equation
Writing the Simple Regression Equation
Be sure to name the transformation variable, if needed, when writing
the regression equation
Response Variable = −3.67 LN(Price of Roses) + 32.58
Simple Regression Equation with LN transformation on the explanatory variable
Introduction
Ideally the goal is to select explanatory variables to build the simplest regression equation possible that will produce a good estimate of the response variable (parsimony)
this is to simplify interpretation of the regression coefficients
selecting good variables for use in a regression analysis usually involves initially engaging in some research
To use simple linear regression analysis, there needs to be a linear relation between:
1) the response variable and the explanatory variable
Investigate the linear relations by examining:
1) Correlation: the numerical measure of strength in the relation between the explanatory and response variables
2) scatter plot of the response variable against the explanatory variable
the plot can show a linear dot pattern, indicating a likely need for the explanatory variable to be included in the regression equation
the plot can show a non-linear dot pattern, suggesting a transformation of an explanatory variable may be needed
the plot can show a horizontal dot pattern, suggesting the variable should not be used in the regression equation
3) residual plots
looking for a scattered, cloudy dot pattern with consistent variance at each level of the explanatory variable to help confirm a linear relation likely exists
if a pattern is observed, this suggests transformations may need to be applied
correlation
Response variables (variable of interest) : typically placed on y-axis
explanatory variables: typically placed on x-axis
Choosing the role of the variables depends on how you think the variables are related
Correlation coefficient (r): permits you to quantify the strength of the linear relation between the 2 variables
In Excel use =CORREL feature
properties of correlation coefficient
range from -1 to 1
sign gives the direction of the association
unitless measure
not affected by changes to the center or scale of the variables
depends only on the z-scores
sensitive to outliers
does not imply causation (could be lurking variables standing behind the association)
check the scatter plot for straightness even if the value of r is high
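The property that r depends only on the z-scores can be illustrated by computing it that way directly. The sketch below uses the salary and housing price values from Example1A in these notes and is meant as a cross-check on Excel's =CORREL, not a replacement for it; the helper names are made up.

```python
import math

# Salary and housing price (both in $10,000) from Example1A.
salary = [1, 2, 3, 4.5, 5.2, 8.4, 11, 12.4, 12.5, 13.1, 15, 17, 22.2, 22.6, 25.8]
housing = [45, 47, 57, 59, 60, 65, 66, 67, 69, 70, 70, 72, 73, 74, 76]


def z_scores(values):
    """Standardize values using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def correlation(x, y):
    """Correlation coefficient as the sum of z-score products over n - 1."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


r = correlation(salary, housing)

# Changing the center and scale of one variable (here: salary in dollars
# plus an arbitrary shift of 500) leaves r unchanged.
r_rescaled = correlation([s * 10000 + 500 for s in salary], housing)

print(f"r = {r:.4f}, after rescaling salary r = {r_rescaled:.4f}")
```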
scatter
Scatterplots
displays relation (association) between 2 quantitative variables
investigator can see patterns, trends, and outliers
1. Direction of the association
negative positive
2. Form of the association
nonlinear vs linear
3. Strength of association
How discernable is the data pattern or trend?
weak, moderate, strong
residual
Question: What line can be used to model the linear data pattern displayed in the scatterplot?
the equation of the best fit line is the regression line (model)
the estimates for the response variable made through the regression line are called the predicted values
residuals
To find residuals, subtract the predicted value (calculated with the regression equation) from the observed value
negative residual: the regression equation is overestimating
positive residual: the regression equation is underestimating
Best fitting line means the sum of its squared residuals is smallest (least squares)
Example1A
Question: Is there a linear relation between the response variable (salary) and housing price?
1) correlation (CORREL feature in Excel)
2) scatter plot (select variables and use the Insert .. Charts menu)
3) residual plot (run the regression in Excel using the Data Analysis Tool Pack and select Residual Plots)
Salary ($10,000)    Housing Price ($10,000)
1                   45
2                   47
3                   57
4.5                 59
5.2                 60
8.4                 65
11                  66
12.4                67
12.5                69
13.1                70
15                  70
17                  72
22.2                73
22.6                74
25.8                76
1) Correlation: 0.9086846363
A correlation of approximately .91 implies that salary and housing price have
a strong, positive, linear relation where salary increases as housing price increases in a systematically predictable way
2) Scatter Plot
From the scatter plot it can be seen that the relation appears to be positive and strong. However, there also
appears to be a slight bend in the dot pattern. Further investigation is needed in order to determine if the
trend is linear
3) Residual Plot
There is a pattern (dot pattern is not scattered) evident in the residual plot. This suggests that some other relation, beyond a linear relation,
would likely better represent the relation between salary and housing price. Hence a transformation should be
performed
Transform1
Goal: Make the dot pattern of a scatterplot more nearly linear
calculate transforms for the explanatory and the response variables
look at scatterplots of combinations of transformed and/or non-transformed values
Can use Adjusted R-Squared to help choose between the candidate linear equation models that result
Non-linear dot patterns in scatterplots and possible transformations to straighten
year salary (in $10,000) LN(salary)
1980 1
1982 2
1990 3
1990 4.5
1991 5.2
1996 8.4
1997 11
1998 12.4
1999 12.5
1999 13.1
1999 15
2001 17
2001 22.2
2004 22.6
2005 25.8
[Dot-pattern diagrams not reproduced. Transformed pairs labelled on them: (LN(x), y), (x, y^2), (LN(y), LN(x)), (x, 1/y), (x, SQRT(y)), (x, LN(y))]
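The transformed columns for the salary data can be calculated as shown in the sketch below, which mirrors what Excel's =LN, =SQRT, and reciprocal formulas would produce. The dictionary of candidate transformations is an illustrative selection, not the full list from the course notes.

```python
import math

# Salary (in $10,000) from the Transform1 table.
salary = [1, 2, 3, 4.5, 5.2, 8.4, 11, 12.4, 12.5, 13.1, 15, 17, 22.2, 22.6, 25.8]

# Candidate transformations of the response variable (illustrative subset).
transformations = {
    "LN(salary)": math.log,           # natural log, like Excel's =LN
    "SQRT(salary)": math.sqrt,        # square root, like Excel's =SQRT
    "1/salary": lambda s: 1 / s,      # reciprocal
    "salary^2": lambda s: s ** 2,     # square
}

transformed = {
    name: [func(s) for s in salary] for name, func in transformations.items()
}

for name, values in transformed.items():
    preview = ", ".join(f"{v:.3f}" for v in values[:4])
    print(f"{name}: {preview}, ...")
```

Each transformed column can then be plotted against the explanatory variable to see which scatterplot looks most nearly linear.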
Significance
What else to check in the Regression Output
The overall significance of the regression equation (in the ANOVA table)
The regression coefficient is the average change in y expected per unit change in the explanatory variable when all other explanatory variables are held constant
How significant are the regression equation and/or the coefficients?
Values less than .05 for the significance and/or p-values suggest that 1 or more of the explanatory variables are good predictors of the response variable
also, check R-squared (use adjusted R-squared)
Standard Approach we will use (1 of the following will be the investigation conclusion)
If regression equation is not significant, the model is not a good fit to the data
If the regression model is significant but the coefficient is not, the model provides improved fit over using the expected value of the response variable as the estimated prediction
If the regression model is significant and the coefficient is significant, the model is a good fit for the data and the explanatory variable is contributing significantly towards the quality of prediction
Coefficient significance Model significance
Example2
Is the regression model with the Square Root Transformed response variable salary the better model? Yes according to the analysis (scatter plots, residual plots, adjusted r-squared, and standard error comparisons)
year (actual)    year (Explanatory)    salary (in $10,000) (Response)    SQRT(Salary)
1980             80                    1                                 1
1982             82                    2                                 1.4142135624
1990             90                    3                                 1.7320508076
1990             90                    4.5                               2.1213203436
1991             91                    5.2                               2.2803508502
1996             96                    8.4                               2.8982753492
1997             97                    11                                3.3166247904
1998             98                    12.4                              3.5213633723
1999             99                    12.5                              3.5355339059
1999             99                    13.1                              3.6193922142
1999             99                    15                                3.8729833462
2001             101                   17                                4.1231056256
2001             101                   22.2                              4.7116875958
2004             104                   22.6                              4.7539457296
2005             105                   25.8                              5.0793700397
Compare Scatter Plots: Which looks visually straighter? (Response: Salary versus Response: Square Root of Salary)
Compare Residual Plots: Which has a cloudy, scattered dot pattern?
Compare Adjusted R-squared values: Which is largest?
Compare Standard Error: Which is smallest?
Both models are significant
The regression model with the transformed response variable has the largest Adjusted R-squared value and smallest standard error
Response Variable : Salary
Regression Statistics
Multiple R           0.9150264296
R Square             0.8372733668
Adjusted R Square    0.8247559335
Standard Error       3.28038926
Observations         15
ANOVA
              df    SS                MS                F                 Significance F
Regression    1     719.7849352684    719.7849352684    66.8885821383     0.0000017547
Residual      13    139.892398065     10.7609536973
Total         14    859.6773333333
                      Coefficients      Standard Error    t Stat            P-value         Lower 95%          Upper 95%          Lower 95.0%        Upper 95.0%
Intercept             -80.4842432619    11.3048781078     -7.1194260119     0.0000078229    -104.906947591     -56.0615389328     -104.906947591     -56.0615389328
year (Explanatory)    0.9657567381      0.1180841892      8.1785440109      0.0000017547    0.710651357        1.2208621192       0.710651357        1.2208621192

Response Variable : Square Root Of Salary
Regression Statistics
Multiple R           0.9668776273
R Square             0.9348523462
Adjusted R Square    0.9298409882
Standard Error       0.3337444496
Observations         15
ANOVA
              df    SS                MS                F                 Significance F
Regression    1     20.7785720521     20.7785720521     186.5467103404    0.0000000044
Residual      13    1.4480096496      0.1113853577
Total         14    22.2265817017
                      Coefficients      Standard Error    t Stat            P-value         Lower 95%          Upper 95%          Lower 95.0%        Upper 95.0%
Intercept             -12.4661593835    1.1501501874      -10.8387230813    0.0000000701    -14.9509077987     -9.9814109684      -14.9509077987     -9.9814109684
year (Explanatory)    0.164087017       0.0120138007      13.6582103637     0.0000000044    0.1381327785       0.1900412554       0.1381327785       0.1900412554
salary (in $10,000) (Response)
year:    80 82 90 90 91 96 97 98 99 99 99 101 101 104 105
salary:  1 2 3 4.5 5.2 8.4 11 12.4 12.5 13.1 15 17 22.2 22.6 25.8
SQRT(Salary)
year:          80 82 90 90 91 96 97 98 99 99 99 101 101 104 105
SQRT(Salary):  1 1.4142135623730951 1.7320508075688772 2.1213203435596424 2.2803508501982761 2.8982753492378879 3.3166247903553998 3.5213633723318019 3.5355339059327378 3.6193922141707713 3.872983346207417 4.1231056256176606 4.7116875957558984 4.7539457296018854 5.0793700396801178
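The model comparison above can be checked outside Excel. The following is a minimal sketch in plain Python, assuming the year and salary values listed in the worksheet; the function name `simple_regression` and the dictionary keys are illustrative choices, not part of the assignment or of Excel's Regression tool.

```python
import math

# Data as listed in the worksheet: coded year (e.g. 99 for 1999, 105 for 2005)
# and salary in $10,000.
years = [80, 82, 90, 90, 91, 96, 97, 98, 99, 99, 99, 101, 101, 104, 105]
salary = [1, 2, 3, 4.5, 5.2, 8.4, 11, 12.4, 12.5, 13.1, 15, 17, 22.2, 22.6, 25.8]


def simple_regression(x, y):
    """Least-squares fit of y = slope * x + intercept with summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_resid / ss_total
    # One explanatory variable, so the residual degrees of freedom are n - 2.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    std_err = math.sqrt(ss_resid / (n - 2))
    return {"slope": slope, "intercept": intercept,
            "r2": r2, "adj_r2": adj_r2, "std_err": std_err}


raw_fit = simple_regression(years, salary)
sqrt_fit = simple_regression(years, [math.sqrt(s) for s in salary])

for label, fit in (("Salary", raw_fit), ("Square Root of Salary", sqrt_fit)):
    print(label, round(fit["adj_r2"], 4), round(fit["std_err"], 4))
```

The two standard errors are in different units (salary versus square root of salary), which is worth keeping in mind when reading them side by side.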
regression equation after trans
To undo the LN transformation and obtain a regression equation in the original salary units, take the exponential (e) of both sides of the equation.
To Undo the LN transformations
(x, LN(y)):      LN(y) = ax + b
                 y = e^(ax + b)
                 y = e^(ax) * e^b
(LN(x), y):      y = a*LN(x) + b   (the response is not transformed, so there is nothing to undo)
(LN(x), LN(y)):  LN(y) = a*LN(x) + b
                 LN(y) = LN(x^a) + b
                 y = x^a * e^b
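The back-transformations can be sketched in Python as a quick sanity check. The function names and the coefficients `a`, `b` and the input `x` below are hypothetical values chosen only for illustration; they are not taken from the salary data. The square-root case is not in the notes and is added here only because the worked example transforms salary with a square root.

```python
import math


def undo_ln_response(a, b, x):
    """(x, LN(y)) model: LN(y) = a*x + b  ->  y = e^(a*x) * e^b."""
    return math.exp(a * x) * math.exp(b)


def undo_ln_both(a, b, x):
    """(LN(x), LN(y)) model: LN(y) = a*LN(x) + b  ->  y = x^a * e^b."""
    return (x ** a) * math.exp(b)


def undo_sqrt_response(a, b, x):
    """(x, SQRT(y)) model: SQRT(y) = a*x + b  ->  y = (a*x + b)^2."""
    return (a * x + b) ** 2


# Hypothetical coefficients and input, for illustration only.
a, b, x = 0.5, 1.2, 4.0
y_ln_response = undo_ln_response(a, b, x)
y_ln_both = undo_ln_both(a, b, x)
y_sqrt = undo_sqrt_response(a, b, x)

# Taking LN (or the square root) of each result recovers the linear form.
print(math.log(y_ln_response), a * x + b)
print(math.log(y_ln_both), a * math.log(x) + b)
print(math.sqrt(y_sqrt), a * x + b)
```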
numerical summaries
Numerical Summaries: Excel Regression Output Table (color coded)

Regression Coefficients
y-intercept: average value of the response variable when all explanatory variables are 0
  the impact of the explanatory variables is not considered
regression coefficients: the average predicted change in the response variable per unit change in the explanatory variable

Correlation Coefficient: Strength of Linear Relation

Coefficient of Determination (r^2) - Use Adjusted R^2
  fraction of the variation in the data accounted for by the regression equation
  sometimes reported as a percentage
  between 0 and 1

SEE (Standard Error of the Estimate)
  standard deviation of the residuals
  how spread out the observations are from the regression line

Confidence Intervals: the range within which, at a specified level of confidence, the population parameter resides

Excel Regression Output Table

Regression Statistics
Multiple R          0.9150264296
R Square            0.8372733668
Adjusted R Square   0.8247559335
Standard Error      3.28038926
Observations        15

ANOVA
            df  SS              MS              F              Significance F
Regression  1   719.7849352684  719.7849352684  66.8885821383  0.0000017547
Residual    13  139.892398065   10.7609536973
Total       14  859.6773333333

                    Coefficients    Standard Error  t Stat         P-value       Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept           -80.4842432619  11.3048781078   -7.1194260119  0.0000078229  -104.906947591  -56.0615389328  -104.906947591  -56.0615389328
year (Explanatory)  0.9657567381    0.1180841892    8.1785440109   0.0000017547  0.710651357     1.2208621192    0.710651357     1.2208621192
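As a rough check on how the t Stat and 95% confidence interval columns relate to the coefficient and its standard error, the sketch below recomputes them for the year (Explanatory) row. The critical value is hardcoded as an assumption: it is the two-sided 95% Student's t value for 13 residual degrees of freedom, which Excel's T.INV.2T(0.05, 13) should return.

```python
# Values copied from the "year (Explanatory)" row of the coefficient table.
coef = 0.9657567381
std_err = 0.1180841892

# Assumption: two-sided 95% critical value of Student's t with 13 df,
# hardcoded here instead of being computed.
t_crit = 2.160368656

# t Stat is the coefficient divided by its standard error.
t_stat = coef / std_err

# The interval is the coefficient plus or minus t_crit standard errors.
lower_95 = coef - t_crit * std_err
upper_95 = coef + t_crit * std_err

print(round(t_stat, 4), round(lower_95, 4), round(upper_95, 4))
```

Because the interval for the year coefficient does not contain 0, it agrees with the small P-value reported in the same row.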
Influential Points
Influential Points
Does the regression equation change significantly when the isolated point is removed?
Would the analysis lead to a different conclusion?
If yes, then the point is influential.
In the regression output, look at:
significance of the model or coefficients
numerical summaries
Example
with influential point | without influential point
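One way to run this check is to fit the line twice, with and without the candidate point, and compare the equations. The sketch below reuses the salary data from the earlier worksheet and drops the last observation (105, 25.8) purely as an illustration; the notes do not identify that point as influential, and `fit_line` is a hypothetical helper name.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


years = [80, 82, 90, 90, 91, 96, 97, 98, 99, 99, 99, 101, 101, 104, 105]
salary = [1, 2, 3, 4.5, 5.2, 8.4, 11, 12.4, 12.5, 13.1, 15, 17, 22.2, 22.6, 25.8]

# Fit with every observation, then without the last one (illustrative choice).
slope_all, intercept_all = fit_line(years, salary)
slope_without, intercept_without = fit_line(years[:-1], salary[:-1])

print("with point:   ", round(slope_all, 4), round(intercept_all, 4))
print("without point:", round(slope_without, 4), round(intercept_without, 4))
print("change in slope:", round(slope_without - slope_all, 4))
```

Whether the change counts as significant is still a judgment made by comparing the two regression outputs (significance and numerical summaries), as the notes describe.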
recode
Recode each multiple-category variable (3 or more categories) into a set of 2-category (0/1) dummy variables
Gender: male (1), female (0)
Political Party: republican (0), democrat (1), libertarian (2), green (3)
Student  IQ   Study hours  Gender  Political Party  Test score
1        110  40           1       0                100
2        110  40           0       2                95
3        120  30           1       1                90
4        110  40           1       3                85
5        100  20           0       3                80
6        110  40           1       1                75
7        90   0            0       0                70
8        110  40           0       2                65
9        80   10           1       3                60
10       80   10           0       1                55
The dummy variables are 1 when the student belongs to the named party (zero otherwise):
x1: republican
x2: democrat
x3: libertarian
Green is the reference category: x1 = x2 = x3 = 0.
Student  IQ   Study hours  Gender  x1  x2  x3  Test score
1        110  40           1       1   0   0   100
2        110  40           0       0   0   1   95
3        120  30           1       0   1   0   90
4        110  40           1       0   0   0   85
5        100  20           0                   80
6        110  40           1                   75
7        90   0            0                   70
8        110  40           0                   65
9        80   10           1                   60
10       80   10           0                   55
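The recoding rule can be sketched in a few lines of Python. This assumes the party codes shown in the first table (0 republican, 1 democrat, 2 libertarian, 3 green); `dummy_code` is a hypothetical helper name.

```python
# Political Party codes for students 1-10, copied from the first table:
# 0 republican, 1 democrat, 2 libertarian, 3 green.
party_codes = [0, 2, 1, 3, 3, 1, 0, 2, 3, 1]


def dummy_code(party):
    """Return (x1, x2, x3) for one party code.

    x1 = 1 for republican, x2 = 1 for democrat, x3 = 1 for libertarian.
    Green is the reference category, so it maps to (0, 0, 0).
    """
    return tuple(1 if party == level else 0 for level in (0, 1, 2))


dummies = [dummy_code(p) for p in party_codes]

for student, (x1, x2, x3) in enumerate(dummies, start=1):
    print(student, x1, x2, x3)
```

Four categories need only three dummy variables, because the fourth category is identified by all three being zero.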