Modules
1 Digital Transformation with Google Cloud
- the 'why' behind what we do.
- Cloud types and tips
- data cloud - manage data across the entire data lifecycle, with AI built in.
- open infrastructure cloud - innovate and scale from on premises, to edge to cloud.
- open standard - standard (specifications)
- open source - everyone can see the source code
- collaboration cloud - teams, google workspace
- trusted cloud - with security tools
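- CapEx to OpEx -> capital expenses (CapEx): buy and own servers ON PREMISES. Operating expenses (OpEx): only pay for what you use in the CLOUD.
- data sovereignty - data is subject to the laws of the country where it is collected or stored (e.g. the right to be forgotten)
- data residency - the physical location where data is stored; some rules require data to stay in the place where it was created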
NETWORKS
Internet Protocols -
#️⃣ 🔤 📒
IP Address =====> Domain Name =====> DNS (Domain Name System ) (phone book)
35.24.123.7====>www.myblog.com ====> www.myblog.com = 35.24.123.7
/\ /\
bandwidth - how much data can be transferred in a certain amount of time; the theoretical maximum.
throughput - how much data is actually transferred in a certain amount of time, given real-world network limitations.
latency - how long it takes for a request to reach its endpoint.
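A rough sketch of how the three terms fit together, assuming a simplified model where delivery time is one latency delay plus payload size divided by transfer rate. The function name and the sample link numbers are made up for illustration.

```python
def transfer_time_seconds(size_megabits: float, rate_mbps: float, latency_seconds: float) -> float:
    """Estimate delivery time: initial delay plus time spent moving the data."""
    if rate_mbps <= 0:
        raise ValueError("rate must be positive")
    return latency_seconds + size_megabits / rate_mbps


# Hypothetical link: 100 Mbps bandwidth (theoretical maximum),
# 80 Mbps throughput in practice, 0.05 s latency, 800 megabit payload.
best_case = transfer_time_seconds(800, 100, 0.05)  # calculated from bandwidth
realistic = transfer_time_seconds(800, 80, 0.05)   # calculated from throughput
print(f"best case: {best_case:.2f} s, realistic: {realistic:.2f} s")
```

The gap between the two results is the difference between bandwidth and throughput.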
Network's Edge - entry point to the network. There are more edge locations than zones.
[[Cloud VPNs]] [[Cloud Interconnect]] - methods to connect networks together
2 Exploring Data Transformation with Google Cloud
types of data
no analysis -> day to day operations.
Structured data - SQL (relational databases) - Tables (rows and columns), well defined schema. [Cloud SQL], [Cloud Spanner]
Semi-structured Data - NoSQL (non-relational databases) - diverse data types, not a tabular format . emails, messages . [Firestore] [Cloud Bigtable] .
Unstructured Data - images, video, files, mp3, backups. [[Cloud Storage]] -> storage classes, with minimum storage duration in days: standard🔥 (none), nearline (30), coldline (90), archive (365). Autoclass - automatically moves objects to a colder storage class when they have not been accessed for a while, and back to standard when they are accessed again.
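A small sketch of the storage-class idea, assuming the rule of thumb "pick the coldest class whose minimum storage duration fits how often the data is read". The helper below is hypothetical and is not a Cloud Storage API; Autoclass makes this kind of decision for you.

```python
# Minimum storage duration in days per class, ordered from hottest to coldest.
MIN_STORAGE_DAYS = {"STANDARD": 0, "NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}


def suggest_storage_class(days_between_reads: int) -> str:
    """Return the coldest class whose minimum duration fits the expected access gap."""
    suggestion = "STANDARD"
    for name, min_days in MIN_STORAGE_DAYS.items():
        if days_between_reads >= min_days:
            suggestion = name  # later entries are colder, so the last match wins
    return suggestion


for gap in (7, 30, 120, 400):
    print(gap, "days between reads ->", suggest_storage_class(gap))
```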
types of data storage places
**databases** - SQL [[Cloud SQL]] / [[Cloud Spanner]] , NoSQL [[Firestore]] / [[Cloud Bigtable]]
**data warehouses** - analyze trends (reports), market analysis. [BigQuery]
**data lakes** - have all content in one place. A repository to ingest, store, explore, process and analyze any type or volume of raw data. All storage products.
MIGRATION
- lift and shift - move the workload as it is, without changing anything in it. Use managed services: [[Datastream]] - replicates data continuously for migration, [[DMS Database Migration service ]].
- [[Looker]] - business intelligence platform to visualize and understand our data. 📈 As with [[BigQuery]], you can connect any SQL database source to Looker.
ANALYTICS IN DATA
batch processing - payroll
Streaming data - inventory - needs to stay up to date [Dataflow] , [PubSub]
[[Etl]] - extract it, transform it, load it: the steps of a [[data pipeline]] process.
[[Apache Beam]] - open source programming model for [[ data pipeline]] design. [[Dataflow]] runs [[Apache Beam]] pipelines on Google Cloud as a serverless, fully managed service, so you don't have to manage the infrastructure yourself.
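The extract, transform, load steps can be sketched in plain Python. This is only an illustration of the three stages, not Apache Beam or Dataflow code; the sample inventory rows and the dictionary standing in for a warehouse are made up.

```python
import csv
import io

# Made-up raw inventory feed; the second row is missing its quantity.
RAW_CSV = """sku,quantity
A100, 5
B200,
A100, 3
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no quantity and total the rest per SKU."""
    totals = {}
    for row in rows:
        quantity = row["quantity"].strip()
        if not quantity:
            continue
        totals[row["sku"]] = totals.get(row["sku"], 0) + int(quantity)
    return totals


def load(totals, destination):
    """Load: write the cleaned result into the destination store."""
    destination.update(totals)
    return destination


warehouse = load(transform(extract(RAW_CSV)), {})
print(warehouse)
```

In a batch job the whole file is processed at once like this; a streaming pipeline would apply the same steps to each message as it arrives.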
3 Innovating with AI
AI
ML is a subset of AI . 🤖
How AI and ML (forward-looking, predict future data) differ from data analytics / business intelligence (backward-looking, historic data)
Data quality dimensions
Completeness - all required data is present
Uniqueness - no dupes
Timeliness - is the data old or fresh
Validity - formatting
Accuracy ✅❌ - true or not
Consistency 🧑🏼✈️ - data is uniform and not contradictory (e.g. a Marine identified by name in some data sources and by Social Security number in others)
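Some of the checks above can be automated. A minimal sketch, assuming records are plain dictionaries; the sample records, the one-year freshness limit, and the simple email pattern are all made up for illustration. Accuracy and consistency are left out because they need a trusted reference source to compare against.

```python
import re
from datetime import date

# Made-up sample records with one problem of each kind.
records = [
    {"id": 1, "email": "ana@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "email": "not-an-email", "updated": date(2024, 5, 2)},
    {"id": 2, "email": "bo@example.com", "updated": date(2020, 1, 1)},
    {"id": 3, "email": None, "updated": date(2024, 5, 3)},
]

EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def quality_report(rows, today, max_age_days=365):
    """Return the record ids that fail each data quality check."""
    ids = [r["id"] for r in rows]
    return {
        # Completeness: a required field is missing.
        "incomplete": [r["id"] for r in rows if r["email"] is None],
        # Uniqueness: the same id appears more than once.
        "duplicate_ids": sorted({i for i in ids if ids.count(i) > 1}),
        # Timeliness: the record is older than the allowed age.
        "stale": [r["id"] for r in rows if (today - r["updated"]).days > max_age_days],
        # Validity: the value is present but badly formatted.
        "invalid_email": [
            r["id"] for r in rows
            if r["email"] is not None and not EMAIL_PATTERN.match(r["email"])
        ],
    }


report = quality_report(records, today=date(2024, 6, 1))
print(report)
```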
==============
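Google's AI principles
- Built and tested for safety
- Incorporates privacy design principles
- Accountable to people
- Made available for uses that support the principles
- Avoids creating or reinforcing unfair bias
- Socially beneficial
- Upholds high standards of scientific excellence
Google will not design or deploy AI for
- Surveillance that violates internationally accepted norms
- Purposes that contravene international law and human rights
- Weapons
- Technologies that cause or are likely to cause overall harm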
If you need to give it some personal info - create a storage bucket and tell the paid version of Gemini to use only that info and not to share it on the internet.
CLOUD PRODUCTS FOR AI AND ML
- [[BIGQUERY ML ]] - you have your own data and train your own model with SQL to predict future outcomes. Integrates with [[Vertex AI]], the platform where a custom-built model can be added to the model registry and deployed to an endpoint for use by an app.
- [[Pre-trained APIs gOOGLE AI ]] -> use if you DON'T have your own training data or a data scientist.
- [[AutoML]] - you load your own data and it trains on top of pre-trained models, choosing the best [[Machine learning model]] for you. Standing on the shoulders of giants.
- Custom training -> [[Vertex AI]] - suite of products that helps at each stage of the ML workflow -> gathering data, feature engineering, building models, deploying and monitoring models. [[TensorFlow]] - open-source library for training and inference of neural networks, created by Google; for researchers to innovate.
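One way to read the four options above as a decision, sketched in Python. The function and its three yes/no inputs are an illustrative simplification of these notes, not a Google tool; a real choice also depends on cost, data size, and the use case.

```python
def suggest_ai_option(has_training_data: bool, has_ml_expertise: bool, comfortable_with_sql: bool) -> str:
    """Map a team's situation to one of the four options in the notes."""
    if not has_training_data:
        return "Pre-trained APIs"               # no data of your own to train on
    if has_ml_expertise:
        return "Custom training on Vertex AI"   # full control over the model
    if comfortable_with_sql:
        return "BigQuery ML"                    # train and predict with SQL
    return "AutoML"                             # load data, let it pick the model


print(suggest_ai_option(has_training_data=False, has_ml_expertise=False, comfortable_with_sql=True))
print(suggest_ai_option(has_training_data=True, has_ml_expertise=False, comfortable_with_sql=True))
```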
==============
Security operations (SecOps) - the practice of protecting your organization's data and systems in the cloud. It involves a combination of processes and technologies that help reduce the risk of data breaches, system outages, and other security incidents.
Site reliability engineering (SRE) - ensures the reliability, availability, and efficiency of software systems and services deployed in the cloud.
Zero trust security - a strategic framework that establishes strict access controls based on the principle of continuous verification. By contrast, security operations focuses on the practical, day-to-day implementation of security measures, like threat detection, incident response, and monitoring.
Cloud security posture management (CSPM) - specifically focuses on identifying and correcting misconfigurations or vulnerabilities within your cloud infrastructure to maintain a strong security posture in the cloud.
==============
Cloud Profiler - identifies how much CPU power, memory, and other resources an application uses. It analyzes application code and pinpoints areas where resources (CPU, memory) are used inefficiently, contributing to performance bottlenecks.
Cloud Monitoring - big-picture health of infrastructure and services; a comprehensive view of your cloud infrastructure and applications.
4 Modernize Infrastructure and Applications with Google Cloud
Microservices - small, independent services that communicate through APIs. The term is mostly about architecture and design; typical of modern cloud app development.
Monolith - opposite of microservices, everything is tightly coupled and can't scale independently.
Container - app + dependencies
[[Virtual Machine]] - container + Operating System
[[Kubernetes Engine (Google)]] - orchestrates containers: manages the infrastructure and the complex dependencies between its parts.
Serverless - you just provide the code. Google does everything else.
Serverless Computing Products
[[App Engine]] - build and deploy web applications (containerized)
[[Cloud Run]] fully managed environment for running containerized apps that can handle multiple events at the same time. / [[Cloud Functions]] simple, single purpose event-driven functions.
potential drawbacks to rehosting on prem legacy services to the cloud
rehosting legacy apps.
[[Google Cloud VMWare Engine ]] - migrate existing VMware workloads
[[Bare Metal Solution]] - for Oracle workloads.
5 Trust and Security with Google Cloud
[[Apigee Edge]] - manage APIs
5 security ===================
Privileged access - grants certain users a broader access than most users
Least privilege - only access needed
Zero trust - assumes nothing and no one can be trusted by default
Security by default - security from the start
Security posture - overall security status of a cloud environment
Cyber resilience - an organization’s ability to withstand and recover quickly from cyber attacks.
Firewall - network device that regulates traffic based on security rules
Encryption - converting data into an unreadable form by using an encryption algorithm
Decryption - uses an encryption 🔑 to restore encrypted data to original form
CIA - confidentiality , integrity, availability
3 As of cloud identity management - Authentication, authorization, auditing
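The 3 As and least privilege can be sketched together in a few lines. This is a toy illustration, not how Cloud IAM is implemented; the user, password, and permission names are hypothetical.

```python
import hashlib
import hmac
import os

SALT = os.urandom(16)  # one shared salt keeps the sketch short; real systems salt per user


def hash_password(password: str) -> bytes:
    """Derive a password hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 100_000)


USERS = {"dana": hash_password("s3cret")}  # hypothetical user store
ROLES = {"dana": {"bucket.read"}}          # least privilege: only the access needed
AUDIT_LOG = []                             # auditing: every attempt is recorded


def access(user: str, password: str, permission: str) -> bool:
    # Authentication: is this really the user they claim to be?
    authenticated = user in USERS and hmac.compare_digest(USERS[user], hash_password(password))
    # Authorization: is this user allowed to do this?
    authorized = authenticated and permission in ROLES.get(user, set())
    # Auditing: keep a record whether or not access was granted.
    AUDIT_LOG.append({"user": user, "permission": permission, "allowed": authorized})
    return authorized


read_ok = access("dana", "s3cret", "bucket.read")
delete_ok = access("dana", "s3cret", "bucket.delete")
wrong_password = access("dana", "guess", "bucket.read")
print(read_ok, delete_ok, wrong_password, len(AUDIT_LOG))
```

Denied attempts are logged too, since auditing is about what was tried as well as what was allowed.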
============
(Day 2, minutes into video)
Network safety
5:01 -> 5:06 =
5:17-> 5:23 =
5:21 -> 5:27 =
Compliance - create organization policy constraints + IAM access to store data in the right region 🗾 (5:43 day 2)
6 Scaling with Google Cloud Operations
MODULE 6
6:01 - scaling with Google Cloud ->
4 🌟golden signals that measure performance and reliability - ⏰ latency, 🚗 traffic, saturation, ❌ errors
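A sketch of how the four golden signals could be calculated from a handful of requests. The request data, the two-second window, and the CPU figures are made up; in practice these numbers come from a monitoring tool rather than hand-written code.

```python
from statistics import mean

# Made-up requests observed during a 2-second window.
requests = [
    {"latency_ms": 120, "status": 200},
    {"latency_ms": 80, "status": 200},
    {"latency_ms": 400, "status": 500},
    {"latency_ms": 100, "status": 200},
]
WINDOW_SECONDS = 2
CPU_USED, CPU_CAPACITY = 3.0, 4.0  # hypothetical cores in use vs. available

signals = {
    "latency_ms": mean(r["latency_ms"] for r in requests),     # how long requests take
    "traffic_rps": len(requests) / WINDOW_SECONDS,             # how much demand there is
    "saturation": CPU_USED / CPU_CAPACITY,                     # how full the system is
    "error_rate": sum(r["status"] >= 500 for r in requests) / len(requests),  # how many requests fail
}
print(signals)
```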
High availability- remain operational even if software or hardware issues occur.
Key Design Principles
Redundancy - duplicate critical components
Replication - several copies of data kept across different regions.
Regions, scalable infrastructure, backups
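Why redundancy helps availability can be shown with a little arithmetic. The sketch assumes each copy fails independently of the others, which is a strong assumption (it is the reason copies are spread across regions); the 99% figure is an example, not a Google Cloud number.

```python
def combined_availability(single_availability: float, copies: int) -> float:
    """Chance that at least one of several independent copies is up."""
    if not 0 <= single_availability <= 1:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    # The system is down only if every copy is down at the same time.
    return 1 - (1 - single_availability) ** copies


for copies in (1, 2, 3):
    print(copies, "copies ->", f"{combined_availability(0.99, copies):.6f}")
```

Each extra copy multiplies the chance of a total outage by the failure rate of a single copy.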
Observability Tools
[[Google Cloud Monitoring]] - metrics, NUMBERS: how much latency, how many hits, how many users logged in. Used by the SRE team.
[[Google Cloud Logging]] - details - somebody hit your endpoint, here is their IP address, this is the request, this is the response .
[[Google Cloud trace]] - app visibility - slow or not - LATENCY
[[Google Cloud profiler]] - app visibility - MEMORY USAGE
[[Google Cloud Error Reporting]] - app visibility - CRASHES , ERRORS AND HOW OFTEN
Levels of Support
Basic Support - free
Standard Support
Enhanced Support
Premium Support
=========
Notes from Test prep
Dataflow is used to transform and process data after it is received, not to ingest it.
Pub/Sub is a messaging service that can receive data from device streams such as sensors, at the start of a data pipeline.
Cloud Billing reports offer a reactive method to help you track and understand what you’ve already spent on Google Cloud resources and provide ways to help optimize your costs
[[Bare Metal Solution]] - for Oracle workloads
[[Dataproc]] - a managed service for large-scale data processing using Apache Hadoop and Spark. While relevant for data preparation for AI, it's not focused on the model development itself.
Methods to connect networks
[[Cloud VPNs]]
[[Cloud Interconnect]]
[[Cloud Run]] - runs containerized web applications