
An Introduction to Statistical Learning

With Applications in R | Hardcover
Springer Texts in Statistics #103
Gareth James (Author)
Springer · August 23, 2013
A newly released revised edition of this title is available.
10.0 out of 10 (1 review)
"Easy to read" (100% of buyers)
Free shipping · Income tax deductible · Exempt from fixed book pricing
5% off: ₩76,950 (list price ₩81,000)

Rewards/benefits: 2,310P

Base rewards

  • 3% rewards: 2,310P

Additional rewards

  • Extra 2,000P on purchases of ₩50,000 or more
  • Extra 2-4% by membership tier on purchases of ₩30,000 or more, up to 2,310P
  • Up to ₩300 in e-voucher credit for writing a review


This title is out of print.
Provides tools for Statistical Learning that are essential for practitioners in science, industry and other fields
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform.
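
As a taste of what those chapter labs look like, here is a minimal sketch in the spirit of the book's R code. The particular model and plot are illustrative choices, not taken from this page; the sketch assumes the companion ISLR package, which bundles the book's data sets, has been installed with install.packages("ISLR").

    # Minimal sketch of an ISLR-style lab session in R.
    # Assumes the companion ISLR package is installed: install.packages("ISLR")
    library(ISLR)                        # data sets that accompany the book

    # Fit a simple linear regression of fuel economy on horsepower
    # using the Auto data set shipped with the ISLR package.
    fit <- lm(mpg ~ horsepower, data = Auto)
    summary(fit)                         # coefficients, standard errors, R^2

    # Visualize the fit: scatterplot with the least-squares line overlaid.
    plot(Auto$horsepower, Auto$mpg,
         xlab = "Horsepower", ylab = "Miles per gallon")
    abline(fit, col = "red", lwd = 2)

Every lab in the book follows this pattern: load a data set, fit one of the chapter's methods with a few lines of R, then inspect and plot the result.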

Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.


About the Authors

Author: Gareth James

Gareth James is a professor of statistics at the University of Southern California. He has published an extensive body of methodological work in the domain of statistical learning with particular emphasis on high-dimensional and functional data. The conceptual framework for this book grew out of his MBA elective courses in this area.
Daniela Witten is an associate professor of biostatistics and statistics at the University of Washington. Her research focuses largely on high-dimensional statistical machine learning. She has contributed to the translation of statistical learning techniques to the field of genomics, through collaborations and as a member of the Institute of Medicine committee that led to the report Evolution of Translational Omics.

Trevor Hastie and Robert Tibshirani are professors of statistics at Stanford University, and are co-authors of the successful textbook Elements of Statistical Learning. Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap.

Table of Contents

Preface vii
1 Introduction 1
2 Statistical Learning 15
  2.1 What Is Statistical Learning? 15
    2.1.1 Why Estimate f? 17
    2.1.2 How Do We Estimate f? 21
    2.1.3 The Trade-Off Between Prediction Accuracy and Model Interpretability 24
    2.1.4 Supervised Versus Unsupervised Learning 26
    2.1.5 Regression Versus Classification Problems 28
  2.2 Assessing Model Accuracy 29
    2.2.1 Measuring the Quality of Fit 29
    2.2.2 The Bias-Variance Trade-Off 33
    2.2.3 The Classification Setting 37
  2.3 Lab: Introduction to R 42
    2.3.1 Basic Commands 42
    2.3.2 Graphics 45
    2.3.3 Indexing Data 47
    2.3.4 Loading Data 48
    2.3.5 Additional Graphical and Numerical Summaries 49
  2.4 Exercises 52
3 Linear Regression 59
  3.1 Simple Linear Regression 61
    3.1.1 Estimating the Coefficients 61
    3.1.2 Assessing the Accuracy of the Coefficient Estimates 63
    3.1.3 Assessing the Accuracy of the Model 68
  3.2 Multiple Linear Regression 71
    3.2.1 Estimating the Regression Coefficients 72
    3.2.2 Some Important Questions 75
  3.3 Other Considerations in the Regression Model 82
    3.3.1 Qualitative Predictors 82
    3.3.2 Extensions of the Linear Model 86
    3.3.3 Potential Problems 92
  3.4 The Marketing Plan 102
  3.5 Comparison of Linear Regression with K-Nearest Neighbors 104
  3.6 Lab: Linear Regression 109
    3.6.1 Libraries 109
    3.6.2 Simple Linear Regression 110
    3.6.3 Multiple Linear Regression 113
    3.6.4 Interaction Terms 115
    3.6.5 Non-linear Transformations of the Predictors 115
    3.6.6 Qualitative Predictors 117
    3.6.7 Writing Functions 119
  3.7 Exercises 120
4 Classification 127
  4.1 An Overview of Classification 128
  4.2 Why Not Linear Regression? 129
  4.3 Logistic Regression 130
    4.3.1 The Logistic Model 131
    4.3.2 Estimating the Regression Coefficients 133
    4.3.3 Making Predictions 134
    4.3.4 Multiple Logistic Regression 135
    4.3.5 Logistic Regression for >2 Response Classes 137
  4.4 Linear Discriminant Analysis 138
    4.4.1 Using Bayes' Theorem for Classification 138
    4.4.2 Linear Discriminant Analysis for p = 1 139
    4.4.3 Linear Discriminant Analysis for p > 1 142
    4.4.4 Quadratic Discriminant Analysis 149
  4.5 A Comparison of Classification Methods 151
  4.6 Lab: Logistic Regression, LDA, QDA, and KNN 154
    4.6.1 The Stock Market Data 154
    4.6.2 Logistic Regression 156
    4.6.3 Linear Discriminant Analysis 161
    4.6.4 Quadratic Discriminant Analysis 162
    4.6.5 K-Nearest Neighbors 163
    4.6.6 An Application to Caravan Insurance Data 164
  4.7 Exercises 168
5 Resampling Methods 175
  5.1 Cross-Validation 176
    5.1.1 The Validation Set Approach 176
    5.1.2 Leave-One-Out Cross-Validation 178
    5.1.3 k-Fold Cross-Validation 181
    5.1.4 Bias-Variance Trade-Off for k-Fold Cross-Validation 183
    5.1.5 Cross-Validation on Classification Problems 184
  5.2 The Bootstrap 187
  5.3 Lab: Cross-Validation and the Bootstrap 190
    5.3.1 The Validation Set Approach 191
    5.3.2 Leave-One-Out Cross-Validation 192
    5.3.3 k-Fold Cross-Validation 193
    5.3.4 The Bootstrap 194
  5.4 Exercises 197
6 Linear Model Selection and Regularization 203
  6.1 Subset Selection 205
    6.1.1 Best Subset Selection 205
    6.1.2 Stepwise Selection 207
    6.1.3 Choosing the Optimal Model 210
  6.2 Shrinkage Methods 214
    6.2.1 Ridge Regression 215
    6.2.2 The Lasso 219
    6.2.3 Selecting the Tuning Parameter 227
  6.3 Dimension Reduction Methods 228
    6.3.1 Principal Components Regression 230
    6.3.2 Partial Least Squares 237
  6.4 Considerations in High Dimensions 238
    6.4.1 High-Dimensional Data 238
    6.4.2 What Goes Wrong in High Dimensions? 239
    6.4.3 Regression in High Dimensions 241
    6.4.4 Interpreting Results in High Dimensions 243
  6.5 Lab 1: Subset Selection Methods 244
    6.5.1 Best Subset Selection 244
    6.5.2 Forward and Backward Stepwise Selection 247
    6.5.3 Choosing Among Models Using the Validation Set Approach and Cross-Validation 248
  6.6 Lab 2: Ridge Regression and the Lasso 251
    6.6.1 Ridge Regression 251
    6.6.2 The Lasso 255
  6.7 Lab 3: PCR and PLS Regression 256
    6.7.1 Principal Components Regression 256
    6.7.2 Partial Least Squares 258
  6.8 Exercises 259
7 Moving Beyond Linearity 265
  7.1 Polynomial Regression 266
  7.2 Step Functions 268
  7.3 Basis Functions 270
  7.4 Regression Splines 271
    7.4.1 Piecewise Polynomials 271
    7.4.2 Constraints and Splines 271
    7.4.3 The Spline Basis Representation 273
    7.4.4 Choosing the Number and Locations of the Knots 274
    7.4.5 Comparison to Polynomial Regression 276
  7.5 Smoothing Splines 277
    7.5.1 An Overview of Smoothing Splines 277
    7.5.2 Choosing the Smoothing Parameter λ 278
  7.6 Local Regression 280
  7.7 Generalized Additive Models 282
    7.7.1 GAMs for Regression Problems 283
    7.7.2 GAMs for Classification Problems 286
  7.8 Lab: Non-linear Modeling 287
    7.8.1 Polynomial Regression and Step Functions 288
    7.8.2 Splines 293
    7.8.3 GAMs 294
  7.9 Exercises 297
8 Tree-Based Methods 303
  8.1 The Basics of Decision Trees 303
    8.1.1 Regression Trees 304
    8.1.2 Classification Trees 311
    8.1.3 Trees Versus Linear Models 314
    8.1.4 Advantages and Disadvantages of Trees 315
  8.2 Bagging, Random Forests, Boosting 316
    8.2.1 Bagging 316
    8.2.2 Random Forests 320
    8.2.3 Boosting 321
  8.3 Lab: Decision Trees 324
    8.3.1 Fitting Classification Trees 324
    8.3.2 Fitting Regression Trees 327
    8.3.3 Bagging and Random Forests 328
    8.3.4 Boosting 330
  8.4 Exercises 332
9 Support Vector Machines 337
  9.1 Maximal Margin Classifier 338
    9.1.1 What Is a Hyperplane? 338
    9.1.2 Classification Using a Separating Hyperplane 339
    9.1.3 The Maximal Margin Classifier 341
    9.1.4 Construction of the Maximal Margin Classifier 342
    9.1.5 The Non-separable Case 343
  9.2 Support Vector Classifiers 344
    9.2.1 Overview of the Support Vector Classifier 344
    9.2.2 Details of the Support Vector Classifier 345
  9.3 Support Vector Machines 349
    9.3.1 Classification with Non-linear Decision Boundaries 349
    9.3.2 The Support Vector Machine 350
    9.3.3 An Application to the Heart Disease Data 354
  9.4 SVMs with More than Two Classes 355
    9.4.1 One-Versus-One Classification 355
    9.4.2 One-Versus-All Classification 356
  9.5 Relationship to Logistic Regression 356
  9.6 Lab: Support Vector Machines 359
    9.6.1 Support Vector Classifier 359
    9.6.2 Support Vector Machine 363
    9.6.3 ROC Curves 365
    9.6.4 SVM with Multiple Classes 366
    9.6.5 Application to Gene Expression Data 366
  9.7 Exercises 368
10 Unsupervised Learning 373
  10.1 The Challenge of Unsupervised Learning 373
  10.2 Principal Components Analysis 374
    10.2.1 What Are Principal Components? 375
    10.2.2 Another Interpretation of Principal Components 379
    10.2.3 More on PCA 380
    10.2.4 Other Uses for Principal Components 385
  10.3 Clustering Methods 385
    10.3.1 K-Means Clustering 386
    10.3.2 Hierarchical Clustering 390
    10.3.3 Practical Issues in Clustering 399
  10.4 Lab 1: Principal Components Analysis 401
  10.5 Lab 2: Clustering 404
    10.5.1 K-Means Clustering 404
    10.5.2 Hierarchical Clustering 406
  10.6 Lab 3: NCI60 Data Example 407
    10.6.1 PCA on the NCI60 Data 408
    10.6.2 Clustering the Observations of the NCI60 Data 410
  10.7 Exercises 413
Index 419

Publisher's Reviews

“This book by James, Witten, Hastie, and Tibshirani was a great pleasure to read, and I was extremely surprised by it and the available material. In my opinion, it is the best book for teaching statistical learning approaches to undergraduate and master students in statistics. … All in all, this is a great textbook for teaching an introductory course in statistical learning. … In my opinion, there is no better book for teaching modern statistical learning at the introductory level.” (Andreas Ziegler, Biometrical Journal, Vol. 58 (3), May, 2016)

“This book has a very strong advantage that sets it well ahead of the competition when it comes to learning about machine learning: it covers all of the necessary details that one has to know in order to apply or implement a machine learning algorithm in a real-world problem. Hence, this book will definitely be of interest to readers from many fields, ranging from computer science to business administration and marketing.” (Charalambos Poullis, Computing Reviews, September, 2014)

“The book provides a good introduction to R. The code for all the statistical methods introduced in the book is carefully explained. … the book will certainly be useful to many people (including me). I will surely use many examples, labs and datasets from this book in my own lectures.” (Pierre Alquier, Mathematical Reviews, July, 2014)

“The stated purpose of this book is to facilitate the transition of statistical learning to mainstream. … it adds information by including more detail and R code to some of the topics in Elements of Statistical Learning. … I am having a lot of fun playing with the code that goes with book. I am glad that this was written.” (Mary Anne, Cats and Dogs with Data, maryannedata.com, June, 2014)

“This book (ISL) is a great Master’s level introduction to statistical learning: statistics for complex datasets. … the homework problems in ISL are at a Master’s level for students who want to learn how to use statistical learning methods to analyze data. … ISL contains 12 very valuable R labs that show how to use many of the statistical learning methods with the R package ISLR … .” (David Olive, Technometrics, Vol. 56 (2), May, 2014)

“Written by four experts of the field, this book offers an excellent entry to statistical learning to a broad audience, including those without strong background in mathematics. … The end-of-chapter exercises make the book an ideal text for both classroom learning and self-study. … The book is suitable for anyone interested in using statistical learning tools to analyze data. It can be used as a textbook for advanced undergraduate and master’s students in statistics or related quantitative fields.” (Jianhua Z. Huang, Journal of Agricultural, Biological, and Environmental Statistics, Vol. 19, 2014)

Product Details

ISBN: 9781461471370 (1461471370)
Publication date: August 23, 2013
Pages: Not yet available
Size: 162 * 238 * 25 mm
Language: English
Series: Springer Texts in Statistics #103

Klover Reviews (1)


10 out of 10 / "Easy to read"
The content is substantial and good.

