Business Analysis

Business Analysis Statistics Test #3

Regression models that exclude redundant input variables can:

The correct answer:
Improve the stability of the parameter estimates and reduce the risk of overfitting.

An analyst knows that storeId, a categorical predictor, is an important predictor of the target. However, storeId has too many levels to be a workable predictor in the model. The analyst wants to combine stores so that they are treated as members of the same class level. What are the two most effective ways to address the issue? (Select two.)

Which interpretation of the estimate is accurate?

The correct answer:
For every $1,000 increase in wage, the odds of the event increase by a factor of 1.142.
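
A logistic regression coefficient lives on the log-odds scale, so exponentiating it gives a multiplier on the odds, not an amount added to the probability. The Python sketch below illustrates this. The coefficient is hypothetical, chosen only so that its exponential equals the 1.142 quoted in the answer, and the helper names are illustrative.

```python
import math

# Hypothetical coefficient per $1,000 of wage, chosen so exp(beta) = 1.142.
beta_per_1000 = math.log(1.142)


def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def prob_from_odds(o):
    """Convert odds back to a probability."""
    return o / (1 + o)


odds_ratio = math.exp(beta_per_1000)

# The odds are multiplied by the same factor at every baseline,
# but the resulting change in probability depends on the baseline.
for base_p in (0.10, 0.50, 0.90):
    new_p = prob_from_odds(odds(base_p) * odds_ratio)
    print(f"baseline p={base_p:.2f} -> p after +$1,000: {new_p:.4f}")
```

The loop shows why reading 1.142 as a change in probability would be misleading: the odds ratio is constant, while the probability shift varies with the starting point.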

Regression models with redundant input variables may:

The correct answer:
Make parameter estimates less stable and increase the risk of overfitting.
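
One common way to quantify how redundancy destabilizes estimates is the variance inflation factor (VIF), which for two predictors reduces to 1 / (1 - r^2), where r is their correlation. The sketch below is an illustration only: the data values are made up, and VIF is a standard diagnostic brought in here, not something the quiz itself names.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF of either predictor when the model has exactly two inputs."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up inputs: x2_redundant is almost exactly 2 * x1, x2_distinct is not.
x1 = [1, 2, 3, 4, 5, 6]
x2_redundant = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
x2_distinct = [5, 1, 4, 2, 6, 3]

vif_redundant = vif_two_predictors(x1, x2_redundant)
vif_distinct = vif_two_predictors(x1, x2_distinct)
print(f"VIF with redundant input: {vif_redundant:.1f}")
print(f"VIF with distinct input:  {vif_distinct:.3f}")
```

A large VIF means the variance of the coefficient estimate is inflated by that factor relative to uncorrelated inputs, which is the instability the answer refers to.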

Which sampling techniques are suitable for partitioning data for model evaluation? (Select two.)

The correct answers:
Stratified random sampling without replacement
Simple random sampling without replacement
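
As a rough illustration of stratified random sampling without replacement, the Python sketch below splits rows into training and validation partitions while keeping the event rate the same in both. The function name, the 50/50 fraction, the seed, and the toy data are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(rows, target_key, train_fraction=0.5, seed=42):
    """Split rows into two partitions, sampling within each target level.

    Each row lands in exactly one partition (no replacement), and each
    target level is divided in the same proportion.
    """
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for row in rows:
        by_level[row[target_key]].append(row)

    train, valid = [], []
    for level_rows in by_level.values():
        shuffled = level_rows[:]          # copy so the input is not reordered
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        valid.extend(shuffled[cut:])
    return train, valid


# Toy data: 200 rows, 20 of them events (10% event rate).
rows = [{"id": i, "target": 1 if i < 20 else 0} for i in range(200)]
train, valid = stratified_split(rows, "target")
print(len(train), sum(r["target"] for r in train))
print(len(valid), sum(r["target"] for r in valid))
```

Dropping the grouping step and shuffling all rows together would give simple random sampling without replacement, the other listed answer.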

What is a reasonable split between training, validation, and testing data for an honest assessment of a predictive model?

The correct answer:
Training: 50% Validation: 50% Testing: 0%

A common strategy for predicting rare events with the LOGISTIC procedure is to build the model on a sample that disproportionately over-represents the cases in which the event occurs (e.g., a 50-50 event/non-event split). What problem does this present?

The correct answer:
Only the intercept estimate is biased
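
Because only the intercept is biased, predictions from an oversampled model can be corrected by shifting the log-odds by a constant offset, ln((rho1 * pi0) / (rho0 * pi1)), where rho1 is the event proportion in the sample and pi1 the event proportion in the population. The Python sketch below applies that standard correction. The function names and the 2% population rate are assumptions for illustration.

```python
import math


def intercept_offset(sample_event_rate, population_event_rate):
    """Amount by which oversampling inflates the intercept (log-odds scale)."""
    rho1, pi1 = sample_event_rate, population_event_rate
    rho0, pi0 = 1 - rho1, 1 - pi1
    return math.log((rho1 * pi0) / (rho0 * pi1))


def adjust_probability(p_sample, sample_event_rate, population_event_rate):
    """Map a probability from the oversampled model back to the population scale."""
    logit = math.log(p_sample / (1 - p_sample))
    logit -= intercept_offset(sample_event_rate, population_event_rate)
    return 1 / (1 + math.exp(-logit))


# Assumed rates: 50% events in the modeling sample, 2% in the population.
for p in (0.2, 0.5, 0.8):
    print(f"sample-scale p={p:.2f} -> population-scale p={adjust_probability(p, 0.5, 0.02):.4f}")
```

The slope coefficients need no change, so the ranking of customers by score is the same before and after the adjustment.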

A confusion matrix is created for data that were oversampled because of a rare target. Which values are not affected by this oversampling?

The correct answer:
Sensitivity and Specificity
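
Sensitivity and specificity are each computed within a single row of the confusion matrix (actual events or actual non-events), so changing the mix of events and non-events leaves them alone. The Python sketch below uses made-up counts and models oversampling as a rescaling of the event row to show that, while measures that mix both rows (such as positive predicted value and accuracy) shift.

```python
def rates(tp, fn, fp, tn):
    """Common confusion-matrix statistics from the four cell counts."""
    return {
        "sensitivity": tp / (tp + fn),            # uses only actual events
        "specificity": tn / (tn + fp),            # uses only actual non-events
        "ppv": tp / (tp + fp),                    # mixes both rows
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Made-up population counts: 50 events, 950 non-events.
original = rates(tp=30, fn=20, fp=95, tn=855)

# Oversampling modeled as scaling the event row by 19, giving a 50-50 split.
oversampled = rates(tp=30 * 19, fn=20 * 19, fp=95, tn=855)

for name in original:
    print(f"{name:12s} original={original[name]:.3f} oversampled={oversampled[name]:.3f}")
```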

An analyst builds a model using the LOGISTIC procedure and now wants the sensitivity and specificity statistics on a validation data set for a range of cutoff values. Which combination of statement and options will produce these statistics?

The correct answer:
Score data=valid1 outroc=roc;
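
To make concrete what sensitivity and specificity "for various cutoff values" means, the Python sketch below computes both from scored validation rows at several cutoffs. It is not a reimplementation of the SAS statement above. The scores, targets, and the rule that a score at or above the cutoff counts as a predicted event are assumptions for illustration.

```python
def roc_table(scores, targets, cutoffs):
    """Return one (cutoff, sensitivity, specificity) tuple per cutoff."""
    table = []
    for cutoff in cutoffs:
        tp = fn = fp = tn = 0
        for score, target in zip(scores, targets):
            predicted_event = score >= cutoff   # assumed classification rule
            if target == 1:
                if predicted_event:
                    tp += 1
                else:
                    fn += 1
            else:
                if predicted_event:
                    fp += 1
                else:
                    tn += 1
        table.append((cutoff, tp / (tp + fn), tn / (tn + fp)))
    return table


# Made-up validation scores and 1/0 targets.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
targets = [1, 1, 0, 1, 0, 0]

table = roc_table(scores, targets, cutoffs=[0.25, 0.5, 0.75])
for cutoff, sensitivity, specificity in table:
    print(f"cutoff={cutoff:.2f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

Lowering the cutoff flags more rows as events, which raises sensitivity and lowers specificity; the table traces that trade-off.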

Which approach is NOT an appropriate way to score new observations with a known target in a logistic regression model?

The correct answer:
Augment the training data set with new observations and rerun the LOGISTIC procedure

Assume that soliciting a non-responder costs $10 and soliciting a responder earns a $200 profit. The logistic regression model produces a probability score named P_R on a SAS data set called VALID. The VALID data set also contains the responder variable Purch, a 1/0 variable coded as 1 for a responder. Customers will be solicited when their probability score is greater than 0.05. Which SAS statement computes the profit for each customer in the VALID data set?

The correct answer:
Profit=(P_R>0.05)*Purch*200-(P_R>.05)*(1-Purch)*10;
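
The SAS expression relies on a comparison evaluating to 1 or 0, so each term switches on only for solicited customers. The same arithmetic can be sketched in Python as below. The function name and the example rows are hypothetical, while the cutoff, gain, and cost come from the question.

```python
def profit(p_r, purch, cutoff=0.05, gain=200, cost=10):
    """Profit for one customer, mirroring the boolean arithmetic of the SAS statement."""
    solicited = int(p_r > cutoff)          # 1 if contacted, 0 otherwise
    return solicited * purch * gain - solicited * (1 - purch) * cost


# Hypothetical scored customers: (probability score, 1/0 responder flag).
customers = [(0.10, 1), (0.10, 0), (0.03, 1), (0.03, 0)]
for p_r, purch in customers:
    print(f"P_R={p_r:.2f} Purch={purch} profit={profit(p_r, purch)}")

total_profit = sum(profit(p_r, purch) for p_r, purch in customers)
print("total:", total_profit)
```

Customers at or below the cutoff contribute zero whether or not they would have responded, because they are never contacted.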