Business Analysis Statistics Test #1

The standard form of a linear regression is Y = beta0 + beta1*X + error. Which statement best summarizes the assumptions placed on the errors?

The errors are independent, normally distributed with constant mean and zero variance.

The errors are correlated, normally distributed with zero mean and constant variance.

The errors are independent, normally distributed with zero mean and constant variance.

The errors are correlated, normally distributed with constant mean and zero variance.

The correct answer:
The errors are independent, normally distributed with zero mean and constant variance.

Which SAS program will split the original data set into training and validation data sets, stratified by county, with 60% of the data in the training set and 40% in the validation set?

Proc surveyselect data=SASUSER.DATABASE samprate=0.6 out=sample;strata county;run;

Proc sort data=SASUSER.DATABASE;by county;run;proc surveyselect data=SASUSER.DATABASE samprate=0.6 out=sample outall;strata county;run;

Proc sort data=SASUSER.DATABASE;by county;run;proc surveyselect data=SASUSER.DATABASE samprate=0.6 out=sample;strata county;run;

Proc sort data=SASUSER.DATABASE;by county;run;proc surveyselect data=SASUSER.DATABASE samprate=0.6 out=sample outall;run;

The correct answer:
Proc sort data=SASUSER.DATABASE;by county;run;proc surveyselect data=SASUSER.DATABASE samprate=0.6 out=sample outall;strata county;run;
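
To illustrate what a stratified 60/40 split with a selection flag looks like, here is a minimal sketch in Python rather than SAS. The function name `stratified_split`, the sample rows, and the simple rounding rule are assumptions made for illustration; PROC SURVEYSELECT's own sampling and rounding behavior may differ.

```python
import random
from collections import defaultdict

def stratified_split(rows, strata_key, rate, seed=0):
    """Flag each row as selected (training) or not (validation), per stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for i, row in enumerate(rows):
        by_stratum[row[strata_key]].append(i)
    selected = set()
    for indices in by_stratum.values():
        n_take = round(rate * len(indices))  # assumption: simple rounding
        selected.update(rng.sample(indices, n_take))
    # Keep every row and add a flag, similar in spirit to the OUTALL option.
    return [dict(row, selected=int(i in selected)) for i, row in enumerate(rows)]

# Hypothetical data: 10 rows in county A, 5 rows in county B.
rows = [{"county": c, "id": i} for i, c in enumerate(["A"] * 10 + ["B"] * 5)]
sample = stratified_split(rows, "county", 0.6)
train = [r for r in sample if r["selected"] == 1]
valid = [r for r in sample if r["selected"] == 0]
```

Because the sampling happens within each county, both partitions keep roughly the same county mix as the original data.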

A logistic regression model's input variable, Region (A, B, or C), is investigated by an analyst. The analyst finds that when Region = A, the likelihood of purchasing a specific item is 1. What issue does this highlight?

Problems that arise due to missing values

Quasi-complete separation

Collinearity

Influential observations

The correct answer:
Quasi-complete separation. This occurs when an independent variable, or a combination of independent variables, separates the levels of the dependent variable almost completely. Here every observation with Region = A has the event, while the other regions contain both outcomes, so Region separates the outcome for one level but not for all of them.
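
A quick way to spot this situation before fitting a model is to tabulate the outcome within each level of the input. The sketch below does that in Python; the helper name `separated_levels` and the data are hypothetical and are not taken from the question.

```python
from collections import defaultdict

def separated_levels(levels, events):
    """Return levels of a categorical input whose event rate is exactly 0 or 1."""
    counts = defaultdict(lambda: [0, 0])  # level -> [non-events, events]
    for level, event in zip(levels, events):
        counts[level][event] += 1
    return sorted(lvl for lvl, (n0, n1) in counts.items() if n0 == 0 or n1 == 0)

# Hypothetical data: every Region A customer purchased.
region = ["A", "A", "A", "B", "B", "B", "C", "C"]
purchase = [1, 1, 1, 0, 1, 0, 1, 0]
flagged = separated_levels(region, purchase)
print(flagged)
```

A level that appears in the result is one for which the logistic regression coefficient cannot be estimated reliably.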

An analyst compares the mean salaries of men and women employed by a company. The SAS data set SALARY contains the variables Gender (M or F) and Pay (dollars per year). Which SAS programs can be used to find the p-value for comparing men's salaries with women's salaries? (Select two.)

Please select 2 correct answers

Proc ttest data=salary;class gender;var pay;run;

Proc ttest data=salary;class gender;model pay=gender;run;

Proc glm data=salary;class pay;model pay=gender;run;

Proc glm data=salary;class gender;model pay=gender;run;

The correct answer:
Proc ttest data=salary;class gender;var pay;run;
Proc glm data=salary;class gender;model pay=gender;run;
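
Both programs answer the same question because, with two groups, the squared pooled-variance t statistic equals the one-way ANOVA F statistic, so the two tests give the same p-value. The sketch below checks that identity in Python; the function names and salary figures are made up for illustration, and it assumes the equal-variance (pooled) form of the t-test.

```python
from statistics import mean

def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate."""
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    ss_within = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    pooled_var = ss_within / (nx + ny - 2)
    return (mx - my) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5

def anova_f(x, y):
    """One-way ANOVA F statistic for two groups (1 numerator df)."""
    nx, ny = len(x), len(y)
    mx, my, grand = mean(x), mean(y), mean(x + y)
    ss_between = nx * (mx - grand) ** 2 + ny * (my - grand) ** 2
    ss_within = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    return ss_between / (ss_within / (nx + ny - 2))

# Hypothetical salaries in dollars per year.
pay_m = [52000, 61000, 58000, 49500]
pay_f = [50500, 63000, 57000, 60000, 55000]
t_stat = pooled_t(pay_m, pay_f)
f_stat = anova_f(pay_m, pay_f)
```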

Which statistic, calculated from a validation sample, can help decide which model to use to predict a binary target variable?

Average Squared Error

Adjusted R Square

Chi Square

Mallows' Cp

The correct answer:
Average Squared Error. The model with the lowest average squared error on the validation sample is the one selected.
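
As a small worked example, average squared error for a binary target is the mean of the squared differences between the 0/1 outcome and the predicted probability. The Python sketch below compares two hypothetical models on made-up validation data; the names `model_a` and `model_b` are illustrative only.

```python
def average_squared_error(actual, predicted):
    """Mean of squared differences between 0/1 outcomes and predicted probabilities."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical validation outcomes and predicted probabilities from two models.
valid_target = [1, 0, 0, 1, 0]
predictions = {
    "model_a": [0.8, 0.3, 0.1, 0.6, 0.4],
    "model_b": [0.6, 0.5, 0.4, 0.5, 0.5],
}
ase = {name: average_squared_error(valid_target, p) for name, p in predictions.items()}
best = min(ase, key=ase.get)  # the model with the lowest validation ASE
```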

The total modeling data has been split into training, validation, and test data. Which data are most suitable for model assessment?

Test data

Training data

Total data

Validation data

The correct answer:
Validation data. Validation data serves as the first test against data the model has not seen, so it shows how well the model predicts new cases. Not every project sets aside validation data, but when it is available it also provides useful information for tuning the hyperparameters that control how the model is fit.

Which statistic, when applied to a larger model, suggests a better model?

Adjusted R Square

Mallows' Cp

Average Squared Error

Chi Square

The correct answer:
Adjusted R Square. Adjusted R-squared is a variant of R-squared that accounts for the number of predictors in a regression model and penalizes predictors that do not improve the fit. In other words, it shows whether adding more predictors actually improves the model.
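
The usual formula is 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p is the number of predictors excluding the intercept. The Python sketch below applies it to two hypothetical models; the R-squared values and sample size are invented to show that a larger model can have a higher R-squared yet a lower adjusted R-squared.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared for a linear model with an intercept."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical fits on the same 50 observations.
small_model = adjusted_r_squared(0.700, n_obs=50, n_predictors=3)
large_model = adjusted_r_squared(0.702, n_obs=50, n_predictors=6)
```

Here the three extra predictors raise R-squared only slightly, so the penalty for model size outweighs the gain.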

Which of the following best describes a discordant pair of observations in the LOGISTIC procedure?

One observation is as likely as another to have the event.

An observation with the event has a higher predicted probability than an observation without the event.

An observation with the event has a lower predicted probability than an observation without the event.

An observation with the event has the same predicted probability as an observation without the event.

The correct answer:
An observation with the event has a lower predicted probability than an observation without the event.
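
To make the definition concrete, the Python sketch below pairs every event observation with every non-event observation and counts concordant, discordant, and tied pairs. The function name and data are hypothetical, and the sketch compares raw probabilities; PROC LOGISTIC may group predicted probabilities before comparing them, so its reported counts can differ.

```python
def classify_pairs(events, probs):
    """Count concordant, discordant, and tied (event, non-event) pairs."""
    event_probs = [p for e, p in zip(events, probs) if e == 1]
    nonevent_probs = [p for e, p in zip(events, probs) if e == 0]
    concordant = discordant = tied = 0
    for pe in event_probs:
        for pn in nonevent_probs:
            if pe > pn:
                concordant += 1   # event case ranked above non-event case
            elif pe < pn:
                discordant += 1   # event case ranked below non-event case
            else:
                tied += 1
    return concordant, discordant, tied

# Hypothetical outcomes (1 = event) and predicted probabilities.
events = [1, 1, 0, 0, 0]
probs = [0.9, 0.4, 0.5, 0.4, 0.2]
concordant, discordant, tied = classify_pairs(events, probs)
```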

An analyst has selected this model as the winner because it shows better model fit than a competing model with more predictors. Which statistic supports this choice?

R-Square

Coeff Var

Error DF

Adj R-Sq

The correct answer:
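Adj R-Sq. Adjusted R-squared is a goodness-of-fit measure for linear models that is corrected for the number of predictors. It shows how much of the variation in the target is explained by the inputs, which allows a fair comparison between models with different numbers of predictors.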

When mean imputation is applied to data that has already been partitioned, what is the best way to handle it so that model assessment stays honest?

The sample means from the training data set are applied to the validation and test data sets.

The sample means from each partition of the data are applied to that partition.

The sample means from the test data set are applied to the training and validation data sets.

The sample means from the validation data set are applied to the training and test data sets.

The correct answer:
The sample means from the training data set are applied to the validation and test data sets.
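
The idea is to learn the imputation values once, from the training partition, and then reuse them everywhere else so that no information leaks from the holdout data. A minimal Python sketch, with hypothetical helper names (`fit_means`, `impute`) and made-up rows, might look like this:

```python
from statistics import mean

def fit_means(train_rows, columns):
    """Compute each column's mean from the non-missing training values only."""
    return {c: mean(r[c] for r in train_rows if r[c] is not None) for c in columns}

def impute(rows, means):
    """Replace missing values (None) with the stored training means."""
    return [
        {c: (means[c] if v is None and c in means else v) for c, v in r.items()}
        for r in rows
    ]

# Hypothetical partitions with a missing income in each.
train = [{"income": 40.0}, {"income": None}, {"income": 60.0}]
valid = [{"income": None}, {"income": 90.0}]

means = fit_means(train, ["income"])   # learned from training data only
train_imputed = impute(train, means)
valid_imputed = impute(valid, means)   # validation reuses the training mean
```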

A financial services manager wants to assess the probability that certain clients will default on their home equity line of credit (HELOC). A former employee left the code below. The training data set is named HELOC, and a similar data set of more recent clients is named RECENT_HELOC. Which SAS DATA step statements will calculate the predicted probability of default for recent clients? (Select two.)

<insert here>; data new_prob; set scored_heloc; run;

Please select 2 correct answers

P=default/(1+default);

Odds=exp(default);p=odds/(1+odds);

P=1/(1+exp(-default));

P=(1+exp(default))/exp(default);

The correct answer:
Odds=exp(default);p=odds/(1+odds);
P=1/(1+exp(-default));
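
Both answers convert a logit (log-odds) into a probability and are algebraically the same. The Python sketch below checks this numerically, assuming the scored variable holds the logit; the function names are illustrative. Note the parentheses around `1 + odds`: without them, operator precedence evaluates `odds / 1 + odds`, which is twice the odds rather than a probability.

```python
import math

def prob_from_odds(logit):
    """Convert a logit to a probability by way of the odds."""
    odds = math.exp(logit)
    return odds / (1 + odds)

def prob_from_logistic(logit):
    """Convert a logit to a probability with the logistic function."""
    return 1 / (1 + math.exp(-logit))

# The two forms agree for any logit value.
logits = [-3.0, -0.5, 0.0, 1.2, 4.0]
pairs = [(prob_from_odds(x), prob_from_logistic(x)) for x in logits]
```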



