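Employment statistics are among the most important indicators policymakers use to gauge the overall strength of the economy. In the United States, the government measures unemployment with the Current Population Survey (CPS), which collects demographic and employment information from a wide range of Americans each month. In this exercise, we will apply the topics reviewed in the lecture, along with a few additional techniques, to the September 2013 edition of this nationally representative dataset. The observations in the dataset represent people surveyed in the September 2013 CPS who actually completed the survey. The full dataset has 385 fields, but in this exercise we will use CPSData.csv, a version of the dataset with the following fields: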




Section-1 Loading and Summarizing the Dataset

§ 1.1 How many interviewees are in the dataset?

#
#
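
A minimal sketch of one way to load the data and count interviewees, assuming each row of the file is one interviewee. The inline CSV below is a hypothetical stand-in for CPSData.csv, and its rows and column selection are invented for illustration.

```r
# Toy stand-in for CPSData.csv (hypothetical rows). With the real file you
# would call read.csv("CPSData.csv") instead of read.csv(text = ...).
toy_csv <- "State,Age,Industry
State A,34,Retail trade
State B,51,Manufacturing
State B,8,NA
State C,45,Retail trade"
cps_toy <- read.csv(text = toy_csv)

# One row per interviewee, so the row count is the number of interviewees.
n_interviewees <- nrow(cps_toy)
print(n_interviewees)

# str() and summary() give a quick overview of every column.
str(cps_toy)
summary(cps_toy)
```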

§ 1.2 Among the interviewees with a value reported for the Industry variable, what is the most common industry of employment? Please enter the name exactly how you see it.

#
#
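
One possible approach is to tabulate the Industry variable and sort the counts. The sketch below uses an invented toy data frame; the industry labels are placeholders, not values confirmed from the real file.

```r
# Hypothetical toy data; NA marks interviewees with no reported industry.
cps_toy <- data.frame(
  Industry = c("Retail trade", "Manufacturing", NA, "Retail trade",
               "Educational and health services", NA, NA)
)

# table() drops NA by default, so only reported values are counted.
industry_counts <- sort(table(cps_toy$Industry), decreasing = TRUE)
print(industry_counts)

# The first name after sorting is the most common reported industry.
most_common <- names(industry_counts)[1]
print(most_common)
```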

§ 1.3 Which state has the fewest interviewees?

#
#

Which state has the largest number of interviewees?

#
#
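
Both state questions can be read off a single frequency table. This is a sketch on invented data, assuming the column is named State; note that which.min() and which.max() return only the first match when counts are tied.

```r
# Hypothetical toy data with placeholder state labels.
cps_toy <- data.frame(
  State = c("State B", "State B", "State C", "State C", "State C", "State A")
)

# Count interviewees per state.
state_counts <- table(cps_toy$State)

# Names of the states with the smallest and largest counts.
fewest <- names(which.min(state_counts))
most <- names(which.max(state_counts))
print(c(fewest = fewest, most = most))
```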

§ 1.4 What proportion of interviewees are citizens of the United States?

#
#
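
A sketch of computing a proportion from a categorical column. The column name Citizenship and its three labels are assumptions made for this toy example; check the actual levels in the file with table() before relying on them.

```r
# Hypothetical toy data; the labels are assumed, not read from CPSData.csv.
cps_toy <- data.frame(
  Citizenship = c("Citizen, Native", "Citizen, Native", "Citizen, Naturalized",
                  "Non-Citizen", "Citizen, Native")
)

# Share of each category.
shares <- prop.table(table(cps_toy$Citizenship))
print(shares)

# Citizens are everyone who is not labeled a non-citizen; the mean of a
# logical vector is the fraction of TRUE values.
citizen_share <- mean(cps_toy$Citizenship != "Non-Citizen")
print(citizen_share)
```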

§ 1.5 For which races does the CPS dataset contain at least 250 interviewees of Hispanic ethnicity? (Select all that apply.)

#
#
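
A two-way table is one way to cross race with Hispanic ethnicity. In this sketch the column names Race and Hispanic, the 0/1 coding, and the small threshold are all assumptions chosen for the toy data; the exercise itself uses 250.

```r
# Hypothetical toy data; Hispanic is assumed to be coded 1 (yes) or 0 (no).
cps_toy <- data.frame(
  Race = c("White", "White", "White", "Black", "Asian", "Black", "White"),
  Hispanic = c(1, 1, 0, 1, 0, 0, 1)
)

# Rows are races, columns are the Hispanic indicator.
race_by_hispanic <- table(cps_toy$Race, cps_toy$Hispanic)
print(race_by_hispanic)

# Keep the Hispanic column and filter by a threshold.
hispanic_counts <- race_by_hispanic[, "1"]
threshold <- 2  # toy threshold; the exercise asks for 250
races_over <- names(hispanic_counts)[hispanic_counts >= threshold]
print(races_over)
```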




Section-2 Evaluating Missing Values

§ 2.1 Which variables have at least one interviewee with a missing (NA) value? (Select all that apply.)

#
#
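
Besides reading summary() output, the missing values per variable can be counted directly. A minimal sketch on an invented data frame:

```r
# Hypothetical toy data with some missing entries.
cps_toy <- data.frame(
  Age = c(34, 51, 8),
  Married = c("Married", NA, NA),
  MetroAreaCode = c(26620, NA, 17140)
)

# is.na() gives a logical matrix; colSums() counts TRUE values per column.
na_counts <- colSums(is.na(cps_toy))
print(na_counts)

# Variables with at least one missing value.
vars_with_na <- names(na_counts)[na_counts > 0]
print(vars_with_na)
```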

§ 2.2 Which is the most accurate:

#
#

§ 2.3 How many states had all interviewees living in a non-metropolitan area (that is, every interviewee in the state has a missing MetroAreaCode value)? For this question, treat the District of Columbia as a state (even though it is not technically a state).

#
#

How many states had all interviewees living in a metropolitan area? Again, treat the District of Columbia as a state.

#
#
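
Both questions can be approached by cross-tabulating state against whether MetroAreaCode is missing. The sketch below uses invented data with placeholder state labels and assumes the column is named State.

```r
# Hypothetical toy data; NA in MetroAreaCode means non-metropolitan.
cps_toy <- data.frame(
  State = c("State A", "State A", "State B", "State B", "State C", "State C"),
  MetroAreaCode = c(NA, NA, 17140, NA, 47900, 47900)
)

# Columns "FALSE" and "TRUE" count metropolitan and non-metropolitan rows.
metro_table <- table(cps_toy$State, is.na(cps_toy$MetroAreaCode))
print(metro_table)

# States with no metropolitan interviewees, and states with no
# non-metropolitan interviewees.
all_nonmetro <- rownames(metro_table)[metro_table[, "FALSE"] == 0]
all_metro <- rownames(metro_table)[metro_table[, "TRUE"] == 0]
print(length(all_nonmetro))
print(length(all_metro))
```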

§ 2.4 Which region of the United States has the largest proportion of interviewees living in a non-metropolitan area?

#
#
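
Because the mean of a logical vector is the fraction of TRUE values, tapply() can compute the non-metropolitan proportion per region. A sketch on invented data, assuming the column is named Region:

```r
# Hypothetical toy data with placeholder region labels.
cps_toy <- data.frame(
  Region = c("Region 1", "Region 1", "Region 2", "Region 2", "Region 2"),
  MetroAreaCode = c(NA, 100, NA, NA, 200)
)

# Proportion of interviewees with a missing MetroAreaCode, by region.
nonmetro_share <- tapply(is.na(cps_toy$MetroAreaCode), cps_toy$Region, mean)
print(nonmetro_share)

# Region with the largest proportion.
top_region <- names(which.max(nonmetro_share))
print(top_region)
```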

§ 2.5 Which state has a proportion of interviewees living in a non-metropolitan area closest to 30%?

#
#

Which state has the largest proportion of non-metropolitan interviewees, ignoring states where all interviewees were non-metropolitan?

#
#
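
The same per-group proportion can answer both state-level questions: find the value nearest 0.30, then the largest value below 1. This is a sketch on invented data, not the real proportions.

```r
# Hypothetical toy data: four placeholder states with 2, 4, 3 and 5 rows.
cps_toy <- data.frame(
  State = rep(c("State A", "State B", "State C", "State D"),
              times = c(2, 4, 3, 5)),
  MetroAreaCode = c(NA, NA,
                    NA, 1, 1, 1,
                    NA, NA, 1,
                    NA, 1, 1, 1, 1)
)

# Non-metropolitan proportion per state.
share <- tapply(is.na(cps_toy$MetroAreaCode), cps_toy$State, mean)
print(sort(share))

# State whose proportion is closest to 30%.
closest <- names(which.min(abs(share - 0.30)))

# Largest proportion among states that are not entirely non-metropolitan.
largest_partial <- names(which.max(share[share < 1]))
print(c(closest = closest, largest_partial = largest_partial))
```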




Section-3 Integrating Metropolitan Area Data

§ 3.1 How many observations (codes for metropolitan areas) are there in MetroAreaMap?

#
#

How many observations (codes for countries) are there in CountryMap?

#
#

§ 3.2 What is the name of the variable that was added to the data frame by the merge() operation?

#
#

How many interviewees have a missing value for the new metropolitan area variable?

#
#
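
A sketch of a left join with merge(), which keeps every interviewee even when no matching code exists. The lookup table's column names (Code, MetroArea) and all values are assumptions for this toy example; inspect the real MetroAreaMap with str() to confirm its columns.

```r
# Hypothetical toy data standing in for the CPS data and the lookup table.
cps_toy <- data.frame(
  Id = 1:4,
  MetroAreaCode = c(100, NA, 200, 100)
)
metro_map_toy <- data.frame(
  Code = c(100, 200, 300),
  MetroArea = c("Metro X", "Metro Y", "Metro Z")
)

# Number of codes in the lookup table.
print(nrow(metro_map_toy))

# all.x = TRUE keeps rows of cps_toy that have no match (a left join).
# merge() may reorder the rows.
merged <- merge(cps_toy, metro_map_toy,
                by.x = "MetroAreaCode", by.y = "Code", all.x = TRUE)

# The variable that the merge added, and how often it is missing.
added <- setdiff(names(merged), names(cps_toy))
n_missing <- sum(is.na(merged$MetroArea))
print(added)
print(n_missing)
```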

§ 3.3 Which of the following metropolitan areas has the largest number of interviewees?

#
#

§ 3.4 Which metropolitan area has the highest proportion of interviewees of Hispanic ethnicity?

#
#

§ 3.5 Determine the number of metropolitan areas in the United States from which at least 20% of interviewees are Asian.

#
#
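
Counting areas above a threshold can be done by comparing the per-area proportions with 0.2 and summing the result. A sketch on invented data, assuming columns named MetroArea and Race and a race label "Asian":

```r
# Hypothetical toy data with placeholder metropolitan areas.
cps_toy <- data.frame(
  MetroArea = c(rep("Metro X", 2), rep("Metro Y", 3), rep("Metro Z", 6)),
  Race = c("Asian", "White",
           "Asian", "Asian", "Black",
           "Asian", "White", "White", "White", "White", "Black")
)

# Proportion of Asian interviewees in each area.
asian_share <- tapply(cps_toy$Race == "Asian", cps_toy$MetroArea, mean)
print(sort(asian_share, decreasing = TRUE))

# Number of areas where the proportion is at least 20%.
n_areas <- sum(asian_share >= 0.2)
print(n_areas)
```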

§ 3.6 Passing na.rm=TRUE to the tapply function, determine which metropolitan area has the smallest proportion of interviewees who have received no high school diploma.

#
#
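
The sketch below illustrates why na.rm=TRUE matters here: a single missing Education value makes a group's mean NA unless missing values are dropped. The column names and education labels are assumptions for this toy example.

```r
# Hypothetical toy data; each area has one missing Education value.
cps_toy <- data.frame(
  MetroArea = c("Metro X", "Metro X", "Metro X",
                "Metro Y", "Metro Y", "Metro Y"),
  Education = c("No high school diploma", NA, "Bachelor's degree",
                "High school", "Bachelor's degree", NA)
)

# Comparison with NA yields NA, so the logical vector contains NA too.
no_diploma <- cps_toy$Education == "No high school diploma"

# Without na.rm, any group containing NA has an NA mean.
share_with_na <- tapply(no_diploma, cps_toy$MetroArea, mean)
print(share_with_na)

# Extra arguments to tapply() are passed on to mean().
share <- tapply(no_diploma, cps_toy$MetroArea, mean, na.rm = TRUE)
lowest <- names(which.min(share))
print(lowest)
```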




Section-4 Integrating Country of Birth Data

§ 4.1 What is the name of the variable added to the CPS data frame by this merge operation?

#
#

How many interviewees have a missing value for the new country of birth variable?

#
#

§ 4.2 Among all interviewees born outside of North America, which country was the most common place of birth?

#
#

§ 4.3 What proportion of the interviewees from the “New York-Northern New Jersey-Long Island, NY-NJ-PA” metropolitan area have a country of birth that is not the United States?

#
#
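
A sketch of a proportion within one metropolitan area. It assumes columns named MetroArea and Country, and it assumes interviewees with a missing country are excluded from both numerator and denominator; the data and area names are invented.

```r
# Hypothetical toy data with placeholder areas.
cps_toy <- data.frame(
  MetroArea = c("Metro X", "Metro X", "Metro X", "Metro X", "Metro Y", NA),
  Country = c("United States", "India", NA, "Mexico",
              "United States", "Brazil")
)

# Select rows in the area of interest; guard against NA area names.
in_area <- !is.na(cps_toy$MetroArea) & cps_toy$MetroArea == "Metro X"

# Share born outside the United States, ignoring missing countries.
foreign_share <- mean(cps_toy$Country[in_area] != "United States",
                      na.rm = TRUE)
print(foreign_share)
```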

§ 4.4 Which metropolitan area has the largest number (note – not proportion) of interviewees with a country of birth in India?

#
#

In Brazil?

#
#

In Somalia?

#
#
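
Since these questions ask for counts rather than proportions, sum can replace mean in tapply(). The helper count_born_in() below is a hypothetical convenience function, and the data, column names and area labels are invented.

```r
# Hypothetical toy data with placeholder areas.
cps_toy <- data.frame(
  MetroArea = c("Metro X", "Metro X", "Metro Y", "Metro Y", "Metro Y",
                "Metro Z"),
  Country = c("India", "India", "India", NA, "Brazil", "Brazil")
)

# Count interviewees born in a given country, per metropolitan area,
# sorted from largest to smallest. na.rm = TRUE skips missing countries.
count_born_in <- function(country) {
  counts <- tapply(cps_toy$Country == country, cps_toy$MetroArea,
                   sum, na.rm = TRUE)
  sort(counts, decreasing = TRUE)
}

india_counts <- count_born_in("India")
print(india_counts)
top_india <- names(india_counts)[1]
print(top_india)

# The same helper works for any other country of birth.
print(count_born_in("Brazil"))
```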