Google Professional Data Engineer Architecture and Operations

Use only for cross-cutting design, migration, monitoring, reliability, troubleshooting, and cost or performance choices that do not fit a specific data service better.

Exams
PROFESSIONAL-DATA-ENGINEER
Questions
2
Comments
43

1. PROFESSIONAL-DATA-ENGINEER Topic 1 Question 90

Sequence
183
Discussion ID
17260
Source URL
https://www.examtopics.com/discussions/google/view/17260-exam-professional-data-engineer-topic-1-question-90/
Posted By
-
Posted At
March 22, 2020, 5:59 p.m.

Question

You are deploying MariaDB SQL databases on GCE VM Instances and need to configure monitoring and alerting. You want to collect metrics including network connections, disk IO and replication status from MariaDB with minimal development effort and use StackDriver for dashboards and alerts.
What should you do?

  • A. Install the OpenCensus Agent and create a custom metric collection application with a StackDriver exporter.
  • B. Place the MariaDB instances in an Instance Group with a Health Check.
  • C. Install the StackDriver Logging Agent and configure fluentd in_tail plugin to read MariaDB logs.
  • D. Install the StackDriver Agent and configure the MySQL plugin.

Suggested Answer

D

Comments (20)

Comment 1

ID: 76116 User: Barniyah Badges: Highly Voted Relative Date: 4 years, 10 months ago Absolute Date: Sun 18 Apr 2021 17:58 Selected Answer: - Upvotes: 26

Answer: A
MariaDB needs custom metrics, and Stackdriver's built-in monitoring tools will not provide them. The OpenCensus Agent will do this for you.
For more info, refer to:
https://cloud.google.com/monitoring/custom-metrics/open-census

Comment 1.1

ID: 444675 User: fire558787 Badges: - Relative Date: 3 years, 5 months ago Absolute Date: Wed 14 Sep 2022 17:49 Selected Answer: - Upvotes: 13

It is definitely A.
B: can't be, because Health Checks only verify that the machine is online
C: Stackdriver Logging is for logging. Here we're talking about Monitoring / Alerting
D: StackDriver Agent monitors default metrics of VMs and some Database stuff with the MySQL Plugin. Here you want to monitor some more custom stuff like Replication of MariaDB (I didn't find anything of this sort in the plugin page), and you may want to use Custom Metrics rather than default metrics. "Cloud Monitoring automatically collects more than 1,500 built-in metrics from more than 100 monitored resources. But those metrics cannot capture application-specific data or client-side system data. Those metrics can give you information on backend latency or disk usage, but they can't tell you how many background routines your application spawned." https://cloud.google.com/monitoring/custom-metrics/open-census#monitoring_opencensus_metrics_quickstart-python

Comment 2

ID: 68751 User: [Removed] Badges: Highly Voted Relative Date: 4 years, 11 months ago Absolute Date: Sun 28 Mar 2021 04:44 Selected Answer: - Upvotes: 13

Answer: C
Description: The GitHub repository google-fluentd-catch-all-config includes configuration files for the Logging agent to ingest logs from various third-party software packages.

Comment 2.1

ID: 428197 User: Atulthakur Badges: - Relative Date: 3 years, 6 months ago Absolute Date: Sat 20 Aug 2022 16:33 Selected Answer: - Upvotes: 2

I think it's D, because it's a self-managed DB, and for this we use Stackdriver Agents. And the question is asking about metrics, not logs.

Comment 3

ID: 1087643 User: rocky48 Badges: Most Recent Relative Date: 1 year, 3 months ago Absolute Date: Wed 04 Dec 2024 13:52 Selected Answer: D Upvotes: 2

Here's the rationale:
StackDriver Agent: The StackDriver Agent is designed to collect system and application metrics from virtual machine instances and send them to StackDriver Monitoring. It simplifies the process of collecting and forwarding metrics.
MySQL Plugin: The StackDriver Agent has a MySQL plugin that allows you to collect MySQL-specific metrics without the need for additional custom development. This includes metrics related to network connections, disk IO, and replication status – which are the specific metrics you mentioned.

Option D is the most straightforward and least development-intensive approach to achieve the monitoring and alerting requirements for MariaDB on GCE VM Instances using StackDriver.

Comment 4

ID: 1065975 User: BlehMaks Badges: - Relative Date: 1 year, 4 months ago Absolute Date: Fri 08 Nov 2024 22:33 Selected Answer: A Upvotes: 1

Replication status does not seem to be included in the SQL agent metrics, but I don't like A in terms of effort.

Comment 5

ID: 831614 User: ninjatech Badges: - Relative Date: 2 years ago Absolute Date: Thu 07 Mar 2024 07:51 Selected Answer: - Upvotes: 1

It can't be A, as the question says "minimal development", but OpenCensus requires development work.

Comment 6

ID: 750942 User: slade_wilson Badges: - Relative Date: 2 years, 2 months ago Absolute Date: Wed 20 Dec 2023 14:41 Selected Answer: A Upvotes: 1

To use metrics collected by OpenCensus in your Google Cloud project, you must make the OpenCensus metrics libraries and the Stackdriver exporter available to your application. The Stackdriver exporter exports the metrics that OpenCensus collects to your Google Cloud project. You can then use Cloud Monitoring to chart or monitor those metrics.

Comment 7

ID: 724225 User: dish11dish Badges: - Relative Date: 2 years, 3 months ago Absolute Date: Wed 22 Nov 2023 10:45 Selected Answer: D Upvotes: 5

Option D is Correct
MariaDB is a community-developed, commercially supported fork of the MySQL relational database management system (RDBMS). To collect logs and metrics for MariaDB, use the mysql receivers.

The mysql receiver connects by default to a local MariaDB server using a Unix socket and Unix authentication as the root user.

Reference: https://cloud.google.com/stackdriver/docs/solutions/agents/ops-agent/third-party/mariadb
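For a sense of how little development effort option D implies: on current VMs the (newer) Ops Agent collects MariaDB metrics through its mysql receiver with a few lines of configuration. The following is a minimal sketch, assuming the default Unix-socket path and a root monitoring user; exact field names should be checked against the MariaDB integration docs linked above.

```yaml
# Sketch of /etc/google-cloud-ops-agent/config.yaml (assumed paths/credentials)
metrics:
  receivers:
    mysql:
      type: mysql                            # MariaDB uses the mysql receiver
      endpoint: /var/run/mysqld/mysqld.sock  # assumed default MariaDB socket
      username: root
  service:
    pipelines:
      mysql_pipeline:
        receivers: [mysql]
```

After restarting the agent (sudo systemctl restart google-cloud-ops-agent), the connection, I/O, and replication metrics it collects can be charted and alerted on in Cloud Monitoring.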

Comment 8

ID: 707186 User: girgu Badges: - Relative Date: 2 years, 4 months ago Absolute Date: Sun 29 Oct 2023 15:22 Selected Answer: D Upvotes: 3

https://cloud.google.com/monitoring/agent/ops-agent/third-party/mariadb

Comment 9

ID: 676605 User: clouditis Badges: - Relative Date: 2 years, 5 months ago Absolute Date: Sat 23 Sep 2023 01:10 Selected Answer: - Upvotes: 2

C is the answer; the fluentd plugin is needed as the DB is on GCE.

Comment 10

ID: 653670 User: ducc Badges: - Relative Date: 2 years, 6 months ago Absolute Date: Wed 30 Aug 2023 00:28 Selected Answer: D Upvotes: 2

go for D

Comment 11

ID: 649712 User: eRaymox Badges: - Relative Date: 2 years, 6 months ago Absolute Date: Mon 21 Aug 2023 12:34 Selected Answer: - Upvotes: 2

A
StackDriver Agent monitors default metrics of VMs and some Database stuff with the MySQL Plugin. Here you want to monitor some more custom stuff like Replication of MariaDB (I didn’t find anything of this sort in the plugin page), and you may want to use Custom Metrics rather than default metrics. “Cloud Monitoring automatically collects more than 1,500 built-in metrics from more than 100 monitored resources. But those metrics cannot capture application-specific data or client-side system data. Those metrics can give you information on backend latency or disk usage, but they can’t tell you how many background routines your application spawned.” https://cloud.google.com/monitoring/custom-metrics/open-census#monitoring_opencensus_metrics_quickstart-python

Comment 12

ID: 620517 User: Kriegs Badges: - Relative Date: 2 years, 8 months ago Absolute Date: Thu 22 Jun 2023 18:52 Selected Answer: - Upvotes: 2

I'm not 100% sure as I have no experience with this issue, but I would say it's D - both A and D should work, but the keyword is "with minimal development effort" (a pre-built plugin beats creating custom metrics in terms of simplicity, that's obvious), and all of the relevant data (as per the question) should be there: https://cloud.google.com/monitoring/api/metrics_agent#agent-mysql

I'm not sure if C would work, but it also seems more advanced in implementation than D. B is 100% wrong and insufficient for that use case.

Feel free to prove me wrong :)

Comment 13

ID: 590529 User: NR22 Badges: - Relative Date: 2 years, 10 months ago Absolute Date: Sun 23 Apr 2023 12:04 Selected Answer: - Upvotes: 1

A and D both seem like viable options here, unsure which is Google's preferred method as that would be deemed the correct answer in the exam. Any opinions?

Comment 14

ID: 586364 User: Didine_22 Badges: - Relative Date: 2 years, 11 months ago Absolute Date: Sat 15 Apr 2023 15:27 Selected Answer: D Upvotes: 5

D
MariaDB is a fork of MySQL, so the MySQL plugin should work fine for extracting MariaDB metrics.

Comment 14.1

ID: 607091 User: ST42 Badges: - Relative Date: 2 years, 9 months ago Absolute Date: Thu 25 May 2023 08:55 Selected Answer: - Upvotes: 2

"MariaDB is a community-developed, commercially supported fork of the MySQL relational database management system (RDBMS). To collect logs and metrics for MariaDB, use the mysql receivers."

https://cloud.google.com/monitoring/agent/ops-agent/third-party/mariadb

Comment 15

ID: 530814 User: rbeeraka Badges: - Relative Date: 3 years, 1 month ago Absolute Date: Mon 23 Jan 2023 21:59 Selected Answer: A Upvotes: 2

The OpenCensus Agent is the right one.

Comment 16

ID: 445579 User: nguyenmoon Badges: - Relative Date: 3 years, 5 months ago Absolute Date: Fri 16 Sep 2022 04:24 Selected Answer: - Upvotes: 4

D is correct. Answer: D
MariaDB is a fork of MySQL, so the MySQL plugin should work fine for extracting MariaDB metrics.

Comment 17

ID: 422975 User: safiyu Badges: - Relative Date: 3 years, 7 months ago Absolute Date: Wed 10 Aug 2022 22:04 Selected Answer: - Upvotes: 6

Answer: D
MariaDB is a fork of MySQL, so the MySQL plugin should work fine for extracting MariaDB metrics.
https://cloud.google.com/monitoring/agent/plugins/mysql

2. PROFESSIONAL-DATA-ENGINEER Topic 1 Question 189

Sequence
271
Discussion ID
79608
Source URL
https://www.examtopics.com/discussions/google/view/79608-exam-professional-data-engineer-topic-1-question-189/
Posted By
AWSandeep
Posted At
Sept. 2, 2022, 11:13 p.m.

Question

You need to migrate 1 PB of data from an on-premises data center to Google Cloud. Data transfer time during the migration should take only a few hours. You want to follow Google-recommended practices to facilitate the large data transfer over a secure connection. What should you do?

  • A. Establish a Cloud Interconnect connection between the on-premises data center and Google Cloud, and then use the Storage Transfer Service.
  • B. Use a Transfer Appliance and have engineers manually encrypt, decrypt, and verify the data.
  • C. Establish a Cloud VPN connection, start gcloud compute scp jobs in parallel, and run checksums to verify the data.
  • D. Reduce the data into 3 TB batches, transfer the data using gsutil, and run checksums to verify the data.

Suggested Answer

A

Comments (23)

Comment 1

ID: 687264 User: devaid Badges: Highly Voted Relative Date: 3 years, 5 months ago Absolute Date: Thu 06 Oct 2022 00:06 Selected Answer: A Upvotes: 5

Well, it doesn't mention anything about insufficient bandwidth to meet your project deadline. I guess you can assume they have 200 Gbps+ of bandwidth; otherwise it couldn't take only a few hours.

Comment 2

ID: 1260427 User: iooj Badges: Most Recent Relative Date: 1 year, 7 months ago Absolute Date: Sat 03 Aug 2024 21:50 Selected Answer: - Upvotes: 3

Anyone who wanted to use a Transfer Appliance to migrate data in a few hours had better live near a Google office and run really fast :D

Comment 3

ID: 1102380 User: MaxNRG Badges: - Relative Date: 2 years, 2 months ago Absolute Date: Thu 21 Dec 2023 12:09 Selected Answer: A Upvotes: 2

Cloud Interconnect provides a dedicated private connection between on-prem and Google Cloud for high bandwidth (up to 100 Gbps) and low latency. This facilitates large, fast data transfers.
Storage Transfer Service supports parallel data transfers over Cloud Interconnect. It can transfer petabyte-scale datasets faster by transferring objects in parallel.
Storage Transfer Service uses HTTPS encryption in transit and at rest by default for secure data transfers.
It follows Google-recommended practices for large data migrations vs ad hoc methods like gsutil or scp.
The other options would take too long for a 1 PB transfer (VPN capped at 3 Gbps, manual transfers) or introduce extra steps like batching and checksums. Cloud Interconnect + Storage Transfer is the recommended Google solution.

Comment 4

ID: 1056584 User: LanaOjisan Badges: - Relative Date: 2 years, 4 months ago Absolute Date: Sun 29 Oct 2023 05:52 Selected Answer: - Upvotes: 2

I believe it's A.
One reason is that, for "secure" and "in a few hours," the communication can be done securely over a direct physical line without going through an ISP. Also, with Dedicated Interconnect, the maximum transfer rate can be as high as 200 Gbps, so a 1 PB transfer can complete in as little as 11 hours.
Therefore, A.
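The timing figures quoted in this thread are easy to sanity-check with ideal line-rate arithmetic (ignoring protocol overhead and assuming decimal units, 1 PB = 10^15 bytes):

```python
# Back-of-the-envelope transfer times for 1 PB at various link speeds.
# Assumes ideal sustained throughput with no protocol overhead.

DATA_BITS = 1_000_000_000_000_000 * 8  # 1 PB in bits

def transfer_hours(gbps: float) -> float:
    """Hours to move 1 PB at the given line rate in gigabits per second."""
    seconds = DATA_BITS / (gbps * 1e9)
    return seconds / 3600

# Cloud VPN tunnels top out around 3 Gbps; Interconnect reaches 100-200 Gbps.
for rate in (3, 10, 100, 200):
    print(f"{rate:>4} Gbps -> {transfer_hours(rate):7.1f} hours")
```

At 200 Gbps the ideal transfer is about 11 hours, consistent with the estimate above, and 100 Gbps gives roughly 22 hours (about 30 with real-world overhead); a 3 Gbps VPN tunnel would need on the order of a month.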

Comment 5

ID: 986614 User: arien_chen Badges: - Relative Date: 2 years, 6 months ago Absolute Date: Mon 21 Aug 2023 16:33 Selected Answer: A Upvotes: 3

A
https://cloud.google.com/storage-transfer/docs/transfer-options#:~:text=Transferring%20more%20than%201%20TB%20from%20on%2Dpremises

Comment 6

ID: 965216 User: knith66 Badges: - Relative Date: 2 years, 7 months ago Absolute Date: Fri 28 Jul 2023 04:08 Selected Answer: A Upvotes: 1

Dedicated Interconnect provides direct physical connections between your on-premises network and Google's network. Dedicated Interconnect enables you to transfer large amounts of data between networks, which can be more cost-effective than purchasing additional bandwidth over the public internet. https://cloud.google.com/network-connectivity/docs/interconnect/concepts/dedicated-overview

Comment 6.1

ID: 965217 User: knith66 Badges: - Relative Date: 2 years, 7 months ago Absolute Date: Fri 28 Jul 2023 04:10 Selected Answer: - Upvotes: 1

This link has additional clarity
https://cloud.google.com/network-connectivity/docs/interconnect/concepts/terminology

Comment 7

ID: 941626 User: vaga1 Badges: - Relative Date: 2 years, 8 months ago Absolute Date: Mon 03 Jul 2023 10:24 Selected Answer: B Upvotes: 3

1 PB and "few hours". It is clearly referring to Transfer Appliance

https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#time

Comment 7.1

ID: 965215 User: knith66 Badges: - Relative Date: 2 years, 7 months ago Absolute Date: Fri 28 Jul 2023 04:06 Selected Answer: - Upvotes: 3

Transfer Appliance is a slow process; you won't be able to do it in a few hours.

Comment 8

ID: 885438 User: Oleksandr0501 Badges: - Relative Date: 2 years, 10 months ago Absolute Date: Sun 30 Apr 2023 18:30 Selected Answer: A Upvotes: 1

gpt: Based on security and speed, if the data is highly sensitive and security is the top priority, then option B (using a Transfer Appliance) may be a better choice. Transfer Appliance uses hardware encryption to transfer data and is designed to securely transfer large amounts of data. However, if speed is the primary concern, then option A (using Cloud Interconnect and Storage Transfer Service) may be a better choice as it allows for faster transfer speeds over a dedicated and secure connection. It ultimately depends on the specific needs and priorities of the organization.

A vague, tricky question. Badly written...
B is also good. But as someone said in the discussion, the question asks for a "secure connection", so: Cloud Interconnect (A).

Comment 9

ID: 845552 User: midgoo Badges: - Relative Date: 2 years, 11 months ago Absolute Date: Tue 21 Mar 2023 06:37 Selected Answer: B Upvotes: 3

Either this question is very tricky or very poorly written. It says "Data transfer time during the migration should take only a few hours". We should not add the 20 days of overhead time for the Appliance into the total migration time.

If "a few hours" = 30 hours or more, A will be good enough.
If "a few hours" = 10 or fewer, B is the only way (with multiple devices copying at the same time).

Comment 9.1

ID: 1064432 User: spicebits Badges: - Relative Date: 2 years, 4 months ago Absolute Date: Tue 07 Nov 2023 03:19 Selected Answer: - Upvotes: 1

B can't be the answer - you have to wait 25 days to receive the appliance and another 25 days to get the appliance back and the data loaded into Cloud Storage: https://cloud.google.com/transfer-appliance/docs/4.0/overview#transfer-speeds

Comment 10

ID: 843554 User: Nandhu95 Badges: - Relative Date: 2 years, 11 months ago Absolute Date: Sun 19 Mar 2023 09:31 Selected Answer: A Upvotes: 1

The expected time via Transfer Appliance is around 20 days, while achieving the same using the Storage Transfer Service at the highest bandwidth of 100 Gbps is about 30 hrs; since the question asks for hours, it's A.
Acquiring a Transfer Appliance is straightforward. In the Google Cloud console, you request a Transfer Appliance, indicate how much data you have, and then Google ships one or more appliances to your requested location. You're given a number of days to transfer your data to the appliance ("data capture") and ship it back to Google.

The expected turnaround time for a network appliance to be shipped, loaded with your data, shipped back, and rehydrated on Google Cloud is 20 days. If your online transfer timeframe is calculated to be substantially more than this timeframe, consider Transfer Appliance. The total cost for the 300 TB device process is less than $2,500.

Comment 10.1

ID: 941625 User: vaga1 Badges: - Relative Date: 2 years, 8 months ago Absolute Date: Mon 03 Jul 2023 10:22 Selected Answer: - Upvotes: 1

It says "data transfer time during the migration". That means from when the migration is "activated", i.e. from when the Transfer Appliance device is plugged in and ready to be used.

Comment 11

ID: 833826 User: wjtb Badges: - Relative Date: 3 years ago Absolute Date: Thu 09 Mar 2023 11:10 Selected Answer: B Upvotes: 2

Even with 100 Gbps bandwidth, you will not reach a data transfer time within the range of "hours" for 1 PB. Transfer Appliance is the way to go. https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#time

Comment 12

ID: 814095 User: musumusu Badges: - Relative Date: 3 years ago Absolute Date: Sun 19 Feb 2023 14:08 Selected Answer: - Upvotes: 1

Answer: A.
A one-time transfer using a Transfer Appliance is always cheaper but less secure.
You need to do it faster: Interconnect speeds range from 50 Mbps to 10 Gbps, while a Transfer Appliance can go up to 40 Gbps.
I am choosing A for security concerns only.

Comment 13

ID: 763412 User: AzureDP900 Badges: - Relative Date: 3 years, 2 months ago Absolute Date: Mon 02 Jan 2023 01:37 Selected Answer: - Upvotes: 1

A is right

Comment 13.1

ID: 763413 User: AzureDP900 Badges: - Relative Date: 3 years, 2 months ago Absolute Date: Mon 02 Jan 2023 01:38 Selected Answer: - Upvotes: 1

A. Establish a Cloud Interconnect connection between the on-premises data center and Google Cloud, and then use the Storage Transfer Service. Most Voted

Comment 14

ID: 727118 User: Atnafu Badges: - Relative Date: 3 years, 3 months ago Absolute Date: Fri 25 Nov 2022 23:30 Selected Answer: - Upvotes: 2

B
https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#:~:text=Few%20things%20in,not%20be%20obtained.

Comment 14.1

ID: 727125 User: Atnafu Badges: - Relative Date: 3 years, 3 months ago Absolute Date: Fri 25 Nov 2022 23:58 Selected Answer: - Upvotes: 4

B
It takes 30 hrs with 100 Gbps bandwidth - more than a day to transfer.
https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#:~:text=addresses%20or%20NATs.-,Online%20versus%20offline%20transfer,A%20certain%20amount%20of%20management%20overhead%20is%20built%20into%20these%20calculations.,-As%20noted%20earlier

Comment 15

ID: 667743 User: pluiedust Badges: - Relative Date: 3 years, 6 months ago Absolute Date: Tue 13 Sep 2022 09:11 Selected Answer: A Upvotes: 1

A is correct.

Comment 16

ID: 666520 User: bigquery1102 Badges: - Relative Date: 3 years, 6 months ago Absolute Date: Mon 12 Sep 2022 02:25 Selected Answer: A Upvotes: 1

A is correct
https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets#transfer_appliance_for_larger_transfers

Comment 17

ID: 666307 User: MounicaN Badges: - Relative Date: 3 years, 6 months ago Absolute Date: Sun 11 Sep 2022 19:16 Selected Answer: A Upvotes: 1

Huge amounts of data can be migrated over Cloud Interconnect.