
AWS Certified Data Analytics – Specialist

This course, designed by First Consulting, focuses on "Data Analytics" solutions that can be implemented with AWS Cloud Services, with particular attention to applicable design patterns and best practices.
New for 2022: the course includes an introductory overview of the emerging paradigm shift known as "Data Mesh" and how it may map onto the part of the AWS ecosystem dedicated to analytics data architectures.

DURATION

Minimum 10 days; customizable training plan

CERTIFICATES

First Consulting Certificate of Attendance

CERTIFICATIONS

Preparatory for the AWS certification path

KEY POINTS

Data Analytics
AWS Data Architecture 
Data Mesh 

Syllabus

Domain 1: Collection

Determine the operational characteristics of the collection system

  • Evaluate that the data loss is within tolerance limits in the event of failures

  • Evaluate costs associated with data acquisition, transfer, and provisioning from various sources into the collection system (e.g., networking, bandwidth, ETL/data migration costs)

  • Assess the failure scenarios that the collection system may undergo, and take remediation actions based on impact

  • Determine data persistence at various points of data capture

  • Identify the latency characteristics of the collection system

 

Select a collection system that handles the frequency, volume, and the source of data

  • Describe and characterize the volume and flow characteristics of incoming data (streaming, transactional, batch)

  • Match flow characteristics of data to potential solutions

  • Assess the tradeoffs between various ingestion services taking into account scalability, cost, fault tolerance, latency, etc.

  • Explain the throughput capability of a variety of different types of data collection and identify bottlenecks

  • Choose a collection solution that satisfies connectivity constraints of the source data system

 

Select a collection system that addresses the key properties of data, such as order, format, and compression

  • Describe how to capture data changes at the source

  • Discuss data structure and format, compression applied, and encryption requirements

  • Distinguish the impact of out-of-order delivery of data, duplicate delivery of data, and the tradeoffs between at-most-once, exactly-once, and at-least-once processing

  • Describe how to transform and filter data during the collection process
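
As an illustration of the delivery tradeoffs listed above, the following minimal sketch shows one common way to tolerate duplicate delivery under at-least-once semantics: the consumer remembers which record identifiers it has already handled and skips repeats. The record layout and the `id` field are assumptions made for this example; they are not taken from any specific AWS service.

```python
from typing import Iterable, Iterator, Optional, Set


def deduplicate(records: Iterable[dict], seen: Optional[Set[str]] = None) -> Iterator[dict]:
    """Yield each record once, keyed by its 'id' field (a hypothetical field name)."""
    seen_ids = set() if seen is None else seen
    for record in records:
        record_id = record["id"]
        if record_id in seen_ids:
            # Already processed: a redelivery caused by a retry upstream.
            continue
        seen_ids.add(record_id)
        yield record


# Simulated at-least-once stream in which record "a1" is delivered twice.
delivered = [
    {"id": "a1", "value": 10},
    {"id": "a2", "value": 20},
    {"id": "a1", "value": 10},
]

processed = list(deduplicate(delivered))
total = sum(r["value"] for r in processed)
print([r["id"] for r in processed], total)
```

In a real pipeline the set of seen identifiers would need to live in durable storage, since an in-memory set is lost when the consumer restarts.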

 

Domain 2: Storage and Data Management

Determine the operational characteristics of the storage solution for analytics

  • Determine the appropriate storage service(s) on the basis of cost vs. performance

  • Understand the durability, reliability, and latency characteristics of the storage solution based on requirements

  • Determine the requirements of a system for strong vs. eventual consistency of the storage system

  • Determine the appropriate storage solution to address data freshness requirements

 

Determine data access and retrieval patterns

  • Determine the appropriate storage solution based on update patterns (e.g., bulk, transactional, micro batching)

  • Determine the appropriate storage solution based on access patterns (e.g., sequential vs. random access, continuous usage vs. ad hoc)

  • Determine the appropriate storage solution to address change characteristics of data (append-only changes vs. updates)

  • Determine the appropriate storage solution for long-term storage vs. transient storage

  • Determine the appropriate storage solution for structured vs. semi-structured data

  • Determine the appropriate storage solution to address query latency requirements

  • Select appropriate data layout, schema, structure, and format

  • Determine appropriate mechanisms to address schema evolution requirements

  • Select the storage format for the task

  • Select the compression/encoding strategies for the chosen storage format

  • Select the data sorting and distribution strategies and the storage layout for efficient data access

  • Explain the cost and performance implications of different data distributions, layouts, and formats (e.g., size and number of files)

  • Implement data formatting and partitioning schemes for data-optimized analysis
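
One way to picture a partitioning scheme is the Hive-style `key=value` path layout, which query engines can use to skip partitions that a query does not need. The sketch below builds such an object key from an event timestamp. The function name, the `events` prefix, and the file name are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def partition_key(prefix: str, event_time: datetime, filename: str) -> str:
    """Build a Hive-style, date-partitioned object key (year=/month=/day=)."""
    # Normalize to UTC so that every writer agrees on the partition boundaries.
    t = event_time.astimezone(timezone.utc)
    return f"{prefix}/year={t:%Y}/month={t:%m}/day={t:%d}/{filename}"


key = partition_key(
    "events",  # hypothetical dataset prefix
    datetime(2022, 3, 7, 15, 30, tzinfo=timezone.utc),
    "part-0000.parquet",  # hypothetical file name
)
print(key)
```

Partitioning by date suits queries that filter on a time range; a dataset queried mostly by another attribute would likely want that attribute in the path instead.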

 

Define data lifecycle based on usage patterns and business requirements

  • Determine the strategy to address data lifecycle requirements

  • Apply the lifecycle and data retention policies to different storage solutions

 

Determine the appropriate system for cataloging data and managing metadata

  • Evaluate mechanisms for discovery of new and updated data sources

  • Evaluate mechanisms for creating and updating data catalogs and metadata

  • Explain mechanisms for searching and retrieving data catalogs and metadata

  • Explain mechanisms for tagging and classifying data

Domain 3: Processing

Determine appropriate data processing solution requirements

  • Understand data preparation and usage requirements

  • Understand different types of data sources and targets

  • Evaluate performance and orchestration needs

  • Evaluate appropriate services for cost, scalability, and availability

 

Design a solution for transforming and preparing data for analysis

  • Apply appropriate ETL/ELT techniques for batch and real-time workloads

  • Implement failover, scaling, and replication mechanisms

  • Implement techniques to address concurrency needs

  • Implement techniques to improve cost-optimization efficiencies

  • Apply orchestration workflows

  • Aggregate and enrich data for downstream consumption

 

Automate and operationalize data processing solutions

  • Implement automated techniques for repeatable workflows

  • Apply methods to identify and recover from processing failures

  • Deploy logging and monitoring solutions to enable auditing and traceability

Domain 4: Analysis and Visualization

Determine the operational characteristics of the analysis and visualization solution

  • Determine costs associated with analysis and visualization

  • Determine scalability associated with analysis

  • Determine failover recovery and fault tolerance within the RPO/RTO

  • Determine the availability characteristics of an analysis tool

  • Evaluate dynamic, interactive, and static presentations of data

  • Translate performance requirements to an appropriate visualization approach (pre-compute and consume static data vs. consume dynamic data)

 

Select the appropriate data analysis solution for a given scenario

  • Evaluate and compare analysis solutions

  • Select the right type of analysis based on the customer use case (streaming, interactive, collaborative, operational)

Select the appropriate data visualization solution for a given scenario

  • Evaluate output capabilities for a given analysis solution (metrics, KPIs, tabular, API)

  • Choose the appropriate method for data delivery (e.g., web, mobile, email, collaborative notebooks)

  • Choose and define the appropriate data refresh schedule

  • Choose appropriate tools for different data freshness requirements (e.g., Amazon OpenSearch Service, formerly Amazon Elasticsearch Service, vs. Amazon QuickSight vs. Amazon EMR notebooks)

  • Understand the capabilities of visualization tools for interactive use cases (e.g., drill down, drill through and pivot)

  • Implement the appropriate data access mechanism (e.g., in memory vs. direct access)

  • Implement an integrated solution from multiple heterogeneous data sources

 

Domain 5: Security

Select appropriate authentication and authorization mechanisms

  • Implement appropriate authentication methods (e.g., federated access, SSO, IAM)

  • Implement appropriate authorization methods (e.g., policies, ACL, table/column level permissions)

  • Implement appropriate access control mechanisms (e.g., security groups, role-based control)

 

Apply data protection and encryption techniques

  • Determine data encryption and masking needs

  • Apply different encryption approaches (server-side encryption, client-side encryption, AWS KMS, AWS CloudHSM)

  • Implement at-rest and in-transit encryption mechanisms

  • Implement data obfuscation and masking techniques

  • Apply basic principles of key rotation and secrets management
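
To make the difference between masking and obfuscation concrete, here is a minimal sketch using only the Python standard library. `mask_email` hides most of a value while keeping it recognizable, and `pseudonymize` replaces a value with a keyed hash so that records can still be joined without exposing the original. Both function names and the sample key are assumptions for this example; in practice the key would be held in a secrets manager and rotated, not hard-coded.

```python
import hashlib
import hmac


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain; mask the rest."""
    local, _, domain = email.partition("@")
    if not domain:
        # Not an email address: mask everything.
        return "*" * len(email)
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a deterministic keyed hash (HMAC-SHA256) of the value as hex."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


demo_key = b"example-key-for-illustration-only"  # hypothetical; never hard-code real keys
masked = mask_email("alice@example.com")
token = pseudonymize("alice@example.com", demo_key)
print(masked, token[:12])
```

The keyed hash is deterministic, so the same input always maps to the same token under a given key; rotating the key changes every token, which is worth planning for when tokens are used as join keys.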

 

Apply data governance and compliance controls

  • Determine data governance and compliance requirements

  • Understand and configure access and audit logging across data analytics services

  • Implement appropriate controls to meet compliance requirements


WANT TO KNOW MORE?

Phone

Email (Administrative Office)

Email (Sales)
