Mastering Spark with R : the complete guide to large-scale analysis and modeling

Published
Sebastopol, CA : O'Reilly Media, [2019].
Status
Available Online


More Details

Edition
First edition.
Language
English
ISBN
9781492046349, 1492046345, 9781492046325, 1492046329

Notes

Bibliography
Includes bibliographical references and index.
Description
"Authors Javier Luraschi, Kevin Kuo, and Edgar Ruiz show you how to combine R with Spark to analyze data at scale. This book covers relevant data science topics, cluster computing, and issues that will interest even the most advanced users."--Back cover
Local note
O'Reilly, O'Reilly Online Learning: Academic/Public Library Edition

Citations

APA Citation, 7th Edition (style guide)

Luraschi, J., Kuo, K., & Ruiz, E. (2019). Mastering Spark with R: The complete guide to large-scale analysis and modeling (1st ed.). O'Reilly Media.

Chicago / Turabian - Author Date Citation, 17th Edition (style guide)

Luraschi, Javier, Kevin Kuo, and Edgar Ruiz. 2019. Mastering Spark with R: The Complete Guide to Large-Scale Analysis and Modeling. O'Reilly Media.

Chicago / Turabian - Humanities (Notes and Bibliography) Citation, 17th Edition (style guide)

Luraschi, Javier, Kevin Kuo, and Edgar Ruiz. Mastering Spark with R: The Complete Guide to Large-Scale Analysis and Modeling. O'Reilly Media, 2019.

MLA Citation, 9th Edition (style guide)

Luraschi, Javier, Kevin Kuo, and Edgar Ruiz. Mastering Spark with R: The Complete Guide to Large-Scale Analysis and Modeling. First edition, O'Reilly Media, 2019.

Note: Citations contain only the title, author, edition, publisher, and year published. Use them as a guideline and double-check them for accuracy. Citation formats are based on standards as of August 2021.

Staff View

Grouped Work ID
81a2edf0-7894-1d05-423c-55050374f7f5-eng

Grouping Information

Grouped Work ID: 81a2edf0-7894-1d05-423c-55050374f7f5-eng
Full title: mastering spark with r the complete guide to large scale analysis and modeling
Author: luraschi javier
Grouping Category: book
Last Update: 2024-04-12 23:07:04PM
Last Indexed: 2024-04-17 02:22:30AM

Book Cover Information

Image Source: contentCafe
First Loaded: Jun 8, 2022
Last Used: May 1, 2024

Marc Record

First Detected: Mar 21, 2023 01:15:13 PM
Last File Modification Time: Mar 21, 2023 01:15:13 PM
Suppressed: Record had no items

MARC Record

LEADER 05844cam a2200637 i 4500
001 on1123174078
003 OCoLC
005 20230321131244.0
006 m     o  d        
007 cr unu||||||||
008 191015t20192020caua    ob    001 0 eng d
019 |a 1122917346|a 1131767521
020 |a 9781492046349|q (electronic bk.)
020 |a 1492046345|q (electronic bk.)
020 |a 9781492046325
020 |a 1492046329
035 |a (OCoLC)1123174078|z (OCoLC)1122917346|z (OCoLC)1131767521
037 |a CL0501000076|b Safari Books Online
040 |a UMI|b eng|e rda|e pn|c UMI|d OCLCF|d CDN|d EBLCP|d TEFOD|d GZM|d UKAHL|d N$T|d YDX|d OCLCQ|d OCLCO|d NZAUC|d OCLCQ
049 |a MAIN
050 4|a QA276.45.R3
08204|a 004.2/2|2 23
1001 |a Luraschi, Javier,|e author.|9 429305
24510|a Mastering Spark with R :|b the complete guide to large-scale analysis and modeling /|c Javier Luraschi, Kevin Kuo, and Edgar Ruiz.
250 |a First edition.
264 1|a Sebastopol, CA :|b O'Reilly Media,|c [2019]
264 4|c ©2020
300 |a 1 online resource (xviii, 274 pages) :|b illustrations
336 |a text|b txt|2 rdacontent
337 |a computer|b c|2 rdamedia
338 |a online resource|b cr|2 rdacarrier
504 |a Includes bibliographical references and index.
5050 |a Intro; Copyright; Table of Contents; Foreword; Preface; Formatting; Acknowledgments; Conventions Used in This Book; Using Code Examples; O'Reilly Online Learning; How to Contact Us; Chapter 1. Introduction; Overview; Hadoop; Spark; R; sparklyr; Recap; Chapter 2. Getting Started; Overview; Prerequisites; Installing sparklyr; Installing Spark; Connecting; Using Spark; Web Interface; Analysis; Modeling; Data; Extensions; Distributed R; Streaming; Logs; Disconnecting; Using RStudio; Resources; Recap; Chapter 3. Analysis; Overview; Import; Wrangle; Built-in Functions; Correlations; Visualize
5058 |a Using ggplot2; Using dbplot; Model; Caching; Communicate; Recap; Chapter 4. Modeling; Overview; Exploratory Data Analysis; Feature Engineering; Supervised Learning; Generalized Linear Regression; Other Models; Unsupervised Learning; Data Preparation; Topic Modeling; Recap; Chapter 5. Pipelines; Overview; Creation; Use Cases; Hyperparameter Tuning; Operating Modes; Interoperability; Deployment; Batch Scoring; Real-Time Scoring; Recap; Chapter 6. Clusters; Overview; On-Premises; Managers; Distributions; Cloud; Amazon; Databricks; Google; IBM; Microsoft; Qubole; Kubernetes; Tools; RStudio; Jupyter
5058 |a Livy; Recap; Chapter 7. Connections; Overview; Edge Nodes; Spark Home; Local; Standalone; YARN; YARN Client; YARN Cluster; Livy; Mesos; Kubernetes; Cloud; Batches; Tools; Multiple Connections; Troubleshooting; Logging; Spark Submit; Windows; Recap; Chapter 8. Data; Overview; Reading Data; Paths; Schema; Memory; Columns; Writing Data; Copying Data; File Formats; CSV; JSON; Parquet; Others; File Systems; Storage Systems; Hive; Cassandra; JDBC; Recap; Chapter 9. Tuning; Overview; Graph; Timeline; Configuring; Connect Settings; Submit Settings; Runtime Settings; sparklyr Settings; Partitioning
5058 |a Implicit Partitions; Explicit Partitions; Caching; Checkpointing; Memory; Shuffling; Serialization; Configuration Files; Recap; Chapter 10. Extensions; Overview; H2O; Graphs; XGBoost; Deep Learning; Genomics; Spatial; Troubleshooting; Recap; Chapter 11. Distributed R; Overview; Use Cases; Custom Parsers; Partitioned Modeling; Grid Search; Web APIs; Simulations; Partitions; Grouping; Columns; Context; Functions; Packages; Cluster Requirements; Installing R; Apache Arrow; Troubleshooting; Worker Logs; Resolving Timeouts; Inspecting Partitions; Debugging Workers; Recap; Chapter 12. Streaming
5058 |a Overview; Transformations; Analysis; Modeling; Pipelines; Distributed R; Kafka; Shiny; Recap; Chapter 13. Contributing; Overview; The Spark API; Spark Extensions; Using Scala Code; Recap; Appendix A. Supplemental Code References; Preface; Formatting; Chapter 1; The World's Capacity to Store Information; Daily Downloads of CRAN Packages; Chapter 2; Prerequisites; Chapter 3; Hive Functions; Chapter 4; MLlib Functions; Chapter 6; Google Trends for On-Premises (Mainframes), Cloud Computing, and Kubernetes; Chapter 12; Stream Generator; Installing Kafka; Index; About the Authors; Colophon
520 |a "Authors Javier Luraschi, Kevin Kuo, and Edgar Ruiz show you how to combine R with Spark to analyze data at scale. This book covers relevant data science topics, cluster computing, and issues that will interest even the most advanced users."--Back cover
5880 |a Online resource; title from title page (Safari, viewed October 10, 2019).
590 |a O'Reilly|b O'Reilly Online Learning: Academic/Public Library Edition
63000|a Spark (Electronic resource : Apache Software Foundation)
63007|a Spark (Electronic resource : Apache Software Foundation)|2 fast|0 (OCoLC)fst01938143
650 0|a R (Computer program language)|9 74517
650 0|a Electronic data processing.|9 37046
650 0|a Big data.|9 403931
7001 |a Kuo, Kevin,|e author.|9 429306
7001 |a Ruiz, Edgar,|e author.|9 429307
77608|i Print version:|a Luraschi, Javier.|t Mastering Spark with R : The Complete Guide to Large-Scale Analysis and Modeling.|d Sebastopol : O'Reilly Media, Incorporated, ©2019|z 9781492046370
85640|u https://library.access.arlingtonva.us/login?url=https://learning.oreilly.com/library/view/~/9781492046363/?ar|x O'Reilly|z eBook
938 |a Askews and Holts Library Services|b ASKH|n AH36840625
938 |a ProQuest Ebook Central|b EBLB|n EBL5928213
938 |a EBSCOhost|b EBSC|n 2267502
938 |a YBP Library Services|b YANK|n 300879994
938 |a YBP Library Services|b YANK|n 16494849
994 |a 92|b VIA
999 |c 290085|d 290085