Scalable Data Architecture with Java : Build Efficient Enterprise-Grade Data Architecting Solutions Using Java

Published
Birmingham : Packt Publishing, Limited, 2022.
Status
Available Online

Description

Orchestrate data architecting solutions using Java and related technologies to evaluate, recommend and present the most suitable solution to leadership and clients.

Key Features
Learn how to adapt to the ever-evolving data architecture technology landscape
Understand how to choose the best suited technology, platform, and architecture to realize effective business value
Implement effective data security and governance principles

Book Description
Java architectural patterns and tools help architects to build reliable, scalable, and secure data engineering solutions that collect, manipulate, and publish data. This book will help you make the most of the architecting data solutions available with clear and actionable advice from an expert. You'll start with an overview of data architecture, exploring responsibilities of a Java data architect, and learning about various data formats, data storage, databases, and data application platforms as well as how to choose them. Next, you'll understand how to architect a batch and real-time data processing pipeline. You'll also get to grips with the various Java data processing patterns, before progressing to data security and governance. The later chapters will show you how to publish Data as a Service and how you can architect it. Finally, you'll focus on how to evaluate and recommend an architecture by developing performance benchmarks, estimations, and various decision metrics. By the end of this book, you'll be able to successfully orchestrate data architecture solutions using Java and related technologies as well as to evaluate and present the most suitable solution to your clients.

What you will learn
Analyze and use the best data architecture patterns for problems
Understand when and how to choose Java tools for a data architecture
Build batch and real-time data engineering solutions using Java
Discover how to apply security and governance to a solution
Measure performance, publish benchmarks, and optimize solutions
Evaluate, choose, and present the best architectural alternatives
Understand how to publish Data as a Service using GraphQL and a REST API

Who this book is for
Data architects, aspiring data architects, Java developers and anyone who wants to develop or optimize scalable data architecture solutions using Java will find this book useful. A basic understanding of data architecture and Java programming is required to get the best from this book.

More Details

Format
Language
English
ISBN
9781801072083, 1801072086

Notes

General Note
Configuring and running the application
Local note
O'Reilly O'Reilly Online Learning: Academic/Public Library Edition

Table of Contents

Cover
Title Page
Copyright and Credits
Contributors
About the reviewers
Table of Contents
Preface
Section 1
Foundation of Data Systems
Chapter 1: Basics of Modern Data Architecture
Exploring the landscape of data engineering
What is data engineering?
Dimensions of data
Types of data engineering problems
Responsibilities and challenges of a Java data architect
Data architect versus data engineer
Challenges of a data architect
Techniques to mitigate those challenges
Summary
Chapter 2: Data Storage and Databases
Understanding data types, formats, and encodings
Data types
Data formats
Understanding file, block, and object storage
File storage
Block storage
Object storage
The data lake, data warehouse, and data mart
Data lake
Data warehouse
Data marts
Databases and their types
Relational database
NoSQL database
Data model design considerations
Summary
Chapter 3: Identifying the Right Data Platform
Technical requirements
Virtualization and containerization platforms
Benefits of virtualization
Containerization
Benefits of containerization
Kubernetes
Hadoop platforms
Hadoop architecture
Cloud platforms
Benefits of cloud computing
Choosing the correct platform
When to choose virtualization versus containerization
When to use big data
Choosing between on-premise versus cloud-based solutions
Choosing between various cloud vendors
Summary
Section 2
Building Data Processing Pipelines
Chapter 4: ETL Data Load
A Batch-Based Solution to Ingesting Data in a Data Warehouse
Technical requirements
Understanding the problem and source data
Problem statement
Understanding the source data
Building an effective data model
Relational data warehouse schemas
Evaluation of the schema design
Designing the solution
Implementing and unit testing the solution
Summary
Chapter 5: Architecting a Batch Processing Pipeline
Technical requirements
Developing the architecture and choosing the right tools
Problem statement
Analyzing the problem
Architecting the solution
Factors that affect your choice of storage
Determining storage based on cost
The cost factor in the processing layer
Implementing the solution
Profiling the source data
Writing the Spark application
Deploying and running the Spark application
Developing and testing a Lambda trigger
Performance tuning a Spark job
Querying the ODL using AWS Athena
Summary
Chapter 6: Architecting a Real-Time Processing Pipeline
Technical requirements
Understanding and analyzing the streaming problem
Problem statement
Analyzing the problem
Architecting the solution
Implementing and verifying the design
Setting up Apache Kafka on your local machine
Developing the Kafka streaming application
Unit testing a Kafka Streams application

Citations

APA Citation, 7th Edition (style guide)

Banerjee, S. (2022). Scalable Data Architecture with Java: Build Efficient Enterprise-Grade Data Architecting Solutions Using Java. Packt Publishing, Limited.

Chicago / Turabian - Author Date Citation, 17th Edition (style guide)

Banerjee, Sinchan. 2022. Scalable Data Architecture with Java: Build Efficient Enterprise-Grade Data Architecting Solutions Using Java. Birmingham: Packt Publishing, Limited.

Chicago / Turabian - Humanities (Notes and Bibliography) Citation, 17th Edition (style guide)

Banerjee, Sinchan. Scalable Data Architecture with Java: Build Efficient Enterprise-Grade Data Architecting Solutions Using Java. Birmingham: Packt Publishing, Limited, 2022.

Harvard Citation (style guide)

Banerjee, S. (2022). Scalable data architecture with Java: build efficient enterprise-grade data architecting solutions using Java. Birmingham: Packt Publishing, Limited.

MLA Citation, 9th Edition (style guide)

Banerjee, Sinchan. Scalable Data Architecture with Java: Build Efficient Enterprise-Grade Data Architecting Solutions Using Java. Packt Publishing, Limited, 2022.

Note! Citations contain only title, author, edition, publisher, and year published. Citations should be used as a guideline and should be double checked for accuracy. Citation formats are based on standards as of August 2021.

Staff View

Grouped Work ID
bb5c6b44-8454-9024-bef0-4d369565859d-eng

Grouping Information

Grouped Work ID: bb5c6b44-8454-9024-bef0-4d369565859d-eng
Full title: scalable data architecture with java build efficient enterprise grade data architecting solutions using java
Author: banerjee sinchan
Grouping Category: book
Last Update: 2025-01-24 12:33:29PM
Last Indexed: 2025-05-22 03:35:33AM

Book Cover Information

Image Source: google_isbn
First Loaded: Jul 12, 2023
Last Used: May 3, 2025

Marc Record

First Detected: Mar 20, 2023 10:19:38 AM
Last File Modification Time: Dec 17, 2024 08:21:42 AM
Suppressed: Record had no items

MARC Record

LEADER 07414cam a22005417i 4500
001 on1346366712
003 OCoLC
005 20241217082023.0
006 m     o  d        
007 cr cnu---unuuu
008 221001s2022    enk     o     000 0 eng d
015 |a GBC2G8371|2 bnb
016 7  |a 020752151|2 Uk
019 |a 1346533951
020 |a 9781801072083|q electronic book
020 |a 1801072086|q electronic book
035 |a (OCoLC)1346366712|z (OCoLC)1346533951
037 |a 9781801073080|b O'Reilly Media
037 |a 10163375|b IEEE
040 |a EBLCP|b eng|e rda|c EBLCP|d ORMDA|d EBLCP|d N$T|d YDX|d OCLCF|d UKAHL|d UKMGB|d OCLCQ|d IEEEE|d OCLCO
049 |a MAIN
050  4 |a QA76.758|b .B36 2022
082 04 |a 005.1|2 23/eng/20221004
100 1  |a Banerjee, Sinchan.
245 10 |a Scalable Data Architecture with Java :|b Build Efficient Enterprise-Grade Data Architecting Solutions Using Java /|c Sinchan Banerjee.
264  1 |a Birmingham :|b Packt Publishing, Limited,|c 2022.
300 |a 1 online resource (382 p.)
336 |a text|b txt|2 rdacontent
337 |a computer|b c|2 rdamedia
338 |a online resource|b cr|2 rdacarrier
500 |a Configuring and running the application
505 0  |a Cover -- Title Page -- Copyright and Credits -- Contributors -- About the reviewers -- Table of Contents -- Preface -- Section 1 -- Foundation of Data Systems -- Chapter 1: Basics of Modern Data Architecture -- Exploring the landscape of data engineering -- What is data engineering? -- Dimensions of data -- Types of data engineering problems -- Responsibilities and challenges of a Java data architect -- Data architect versus data engineer -- Challenges of a data architect -- Techniques to mitigate those challenges -- Summary -- Chapter 2: Data Storage and Databases
505 8  |a Understanding data types, formats, and encodings -- Data types -- Data formats -- Understanding file, block, and object storage -- File storage -- Block storage -- Object storage -- The data lake, data warehouse, and data mart -- Data lake -- Data warehouse -- Data marts -- Databases and their types -- Relational database -- NoSQL database -- Data model design considerations -- Summary -- Chapter 3: Identifying the Right Data Platform -- Technical requirements -- Virtualization and containerization platforms -- Benefits of virtualization -- Containerization -- Benefits of containerization
505 8  |a Kubernetes -- Hadoop platforms -- Hadoop architecture -- Cloud platforms -- Benefits of cloud computing -- Choosing the correct platform -- When to choose virtualization versus containerization -- When to use big data -- Choosing between on-premise versus cloud-based solutions -- Choosing between various cloud vendors -- Summary -- Section 2 -- Building Data Processing Pipelines -- Chapter 4: ETL Data Load -- A Batch-Based Solution to Ingesting Data in a Data Warehouse -- Technical requirements -- Understanding the problem and source data -- Problem statement -- Understanding the source data
505 8  |a Building an effective data model -- Relational data warehouse schemas -- Evaluation of the schema design -- Designing the solution -- Implementing and unit testing the solution -- Summary -- Chapter 5: Architecting a Batch Processing Pipeline -- Technical requirements -- Developing the architecture and choosing the right tools -- Problem statement -- Analyzing the problem -- Architecting the solution -- Factors that affect your choice of storage -- Determining storage based on cost -- The cost factor in the processing layer -- Implementing the solution -- Profiling the source data
505 8  |a Writing the Spark application -- Deploying and running the Spark application -- Developing and testing a Lambda trigger -- Performance tuning a Spark job -- Querying the ODL using AWS Athena -- Summary -- Chapter 6: Architecting a Real-Time Processing Pipeline -- Technical requirements -- Understanding and analyzing the streaming problem -- Problem statement -- Analyzing the problem -- Architecting the solution -- Implementing and verifying the design -- Setting up Apache Kafka on your local machine -- Developing the Kafka streaming application -- Unit testing a Kafka Streams application
520 |a Orchestrate data architecting solutions using Java and related technologies to evaluate, recommend and present the most suitable solution to leadership and clients Key Features Learn how to adapt to the ever-evolving data architecture technology landscape Understand how to choose the best suited technology, platform, and architecture to realize effective business value Implement effective data security and governance principles Book Description Java architectural patterns and tools help architects to build reliable, scalable, and secure data engineering solutions that collect, manipulate, and publish data. This book will help you make the most of the architecting data solutions available with clear and actionable advice from an expert. You'll start with an overview of data architecture, exploring responsibilities of a Java data architect, and learning about various data formats, data storage, databases, and data application platforms as well as how to choose them. Next, you'll understand how to architect a batch and real-time data processing pipeline. You'll also get to grips with the various Java data processing patterns, before progressing to data security and governance. The later chapters will show you how to publish Data as a Service and how you can architect it. Finally, you'll focus on how to evaluate and recommend an architecture by developing performance benchmarks, estimations, and various decision metrics. By the end of this book, you'll be able to successfully orchestrate data architecture solutions using Java and related technologies as well as to evaluate and present the most suitable solution to your clients. 
What you will learn Analyze and use the best data architecture patterns for problems Understand when and how to choose Java tools for a data architecture Build batch and real-time data engineering solutions using Java Discover how to apply security and governance to a solution Measure performance, publish benchmarks, and optimize solutions Evaluate, choose, and present the best architectural alternatives Understand how to publish Data as a Service using GraphQL and a REST API Who this book is for Data architects, aspiring data architects, Java developers and anyone who wants to develop or optimize scalable data architecture solutions using Java will find this book useful. A basic understanding of data architecture and Java programming is required to get the best from this book.
590 |a O'Reilly|b O'Reilly Online Learning: Academic/Public Library Edition
650  0 |a Software architecture.|9 75529
650  0 |a Java (Computer program language)|9 70616
776 08 |i Print version:|a Banerjee, Sinchan|t Scalable Data Architecture with Java|d Birmingham : Packt Publishing, Limited,c2022
856 40 |u https://library.access.arlingtonva.us/login?url=https://learning.oreilly.com/library/view/~/9781801073080/?ar|x O'Reilly|z eBook
938 |a Askews and Holts Library Services|b ASKH|n AH40382641
938 |a ProQuest Ebook Central|b EBLB|n EBL7101538
938 |a EBSCOhost|b EBSC|n 3399304
938 |a YBP Library Services|b YANK|n 18150789
994 |a 92|b VIA
999 |c 284249|d 284249