Parallel processing is the biggest strength of data warehouse systems such as Teradata and Netezza. Like Netezza, Teradata is built on a Massively Parallel Processing (MPP) architecture. A Teradata system is made up of Parsing Engines (PEs), the BYNET, Access Module Processors (AMPs), and other components such as nodes.
Teradata aims to be a cost-effective, high-quality system whose parallel design lets it outperform conventional relational database management systems.
Read:
- Commonly used Teradata Date Functions and Examples
- Teradata Analytics Functions and Examples
- Teradata Set Operators: UNION, UNION ALL, INTERSECT, EXCEPT/MINUS
- Teradata Architecture – Components of Teradata
- How Teradata Data Distribution Works on AMPs?
Teradata Architecture Diagram
The following diagram shows the high-level architecture of a Teradata node.
Teradata Architecture – Components
The key components of Teradata are:
Parsing Engine (PE)
The Parsing Engine (PE) receives SQL queries from the client. Whenever you connect to Teradata, you are actually connecting to a PE.
The PE also creates the execution plan for each submitted query. Its responsibilities are to:
- receive the SQL query from the client
- check the query for syntax errors
- check that the user has the privileges needed to read the tables
- pass an efficient execution plan to the BYNET
- receive the results from the AMPs and send them back to the client application
To build an efficient execution plan, the PE uses statistics such as the number of AMPs in the Teradata system and the number of rows in the tables.
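The PE's sequence of checks can be sketched in Python. This is only a toy model: the `Request` class, the `PRIVILEGES` catalog, and every function name are hypothetical, and the "plan" is a plain dictionary rather than anything Teradata actually produces.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user: str
    table: str
    sql: str


# Hypothetical privilege catalog: user -> tables that user may read.
PRIVILEGES = {"alice": {"sales"}}


def check_syntax(request):
    # Toy syntax check: this sketch accepts only SELECT statements.
    if not request.sql.strip().upper().startswith("SELECT"):
        raise SyntaxError("only SELECT statements are accepted in this sketch")


def check_privileges(request):
    # Reject the request if the user has no read access to the table.
    if request.table not in PRIVILEGES.get(request.user, set()):
        raise PermissionError(f"{request.user} may not read {request.table}")


def build_plan(request, amp_count, row_count):
    # Use the statistics (AMP count, row count) to estimate per-AMP work.
    return {
        "table": request.table,
        "amps": list(range(amp_count)),
        "estimated_rows_per_amp": row_count // amp_count,
    }


def parsing_engine(request, amp_count, row_count):
    # Same order as in the text: syntax, then privileges, then the plan.
    check_syntax(request)
    check_privileges(request)
    return build_plan(request, amp_count, row_count)


plan = parsing_engine(
    Request("alice", "sales", "SELECT * FROM sales"), amp_count=4, row_count=1000
)
print(plan)
```

A request that fails either check never reaches the planning step, which mirrors the order of responsibilities described above.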
Access Module Processor (AMP)
Access Module Processors (AMPs) are virtual processors (vprocs) that actually store and retrieve the data.
Each AMP receives data and the execution plan from the Parsing Engine (PE) through the BYNET. It performs tasks such as filtering, aggregation, and grouping on its data and stores the result back to its associated disks. Each AMP is associated with a set of disks, and only that AMP can access those disks. Rows are distributed across the AMPs by hashing the primary index value, so the distribution is even only when the primary index values are unique or close to unique. If a table is created without an explicit primary index (and without a primary key or unique constraint), Teradata by default uses the first column for distribution and creates a non-unique primary index on it.
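How the choice of primary index affects distribution can be illustrated with a short Python sketch. It assumes MD5 as a stand-in hash and a simple modulo to pick the AMP; Teradata uses its own hashing algorithm and hash maps, and the function names `amp_for` and `rows_per_amp` are hypothetical.

```python
import hashlib
from collections import Counter


def amp_for(value, amp_count):
    # Hash the primary index value and map it to an AMP number.
    # MD5 is only an illustrative stand-in for Teradata's row hash.
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % amp_count


def rows_per_amp(values, amp_count):
    # Count how many rows land on each AMP.
    counts = Counter(amp_for(v, amp_count) for v in values)
    return [counts.get(amp, 0) for amp in range(amp_count)]


# A near-unique column (for example a customer id) spreads rows evenly.
unique_ids = range(10_000)
# A column with two distinct values can use at most two AMPs.
genders = ["M", "F"] * 5_000

print("unique ids:", rows_per_amp(unique_ids, 4))
print("genders:   ", rows_per_amp(genders, 4))
```

Rows with the same primary index value always hash to the same AMP, which is why a column with few distinct values leaves some AMPs idle while others hold most of the data.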
BYNET (Message Passing Layer)
The middle layer, the message passing layer, is called the BYNET. It is also the communication layer between the AMPs and the nodes. It receives the execution plan from the PE and sends it to the appropriate AMPs. It also receives the processed data from the AMPs and sends it to the PE.
To maintain high availability, a Teradata system has two BYNETs (BYNET 0 and BYNET 1). If one BYNET fails, the other carries the traffic.
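The message flow and the failover behaviour can be modelled with a small Python sketch. It is a simplification under stated assumptions: the `Bynet` class, `make_amp`, and `dispatch` are hypothetical names, each AMP is a plain function over its own rows, and the plan step is a filter followed by a sum.

```python
class Bynet:
    def __init__(self, name):
        self.name = name
        self.up = True

    def send(self, amp, step):
        # Deliver a plan step to one AMP and return that AMP's result.
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return amp(step)


def make_amp(rows):
    # An AMP owns its rows and runs a plan step against those rows only.
    def run(step):
        return sum(r for r in rows if step(r))
    return run


def dispatch(bynets, amps, step):
    # Send the step to every AMP over the first working BYNET,
    # then combine the partial results the way the PE would.
    partials = []
    for amp in amps:
        for net in bynets:
            try:
                partials.append(net.send(amp, step))
                break
            except ConnectionError:
                continue
        else:
            raise ConnectionError("no BYNET available")
    return sum(partials)


amps = [make_amp([1, 2, 3]), make_amp([4, 5]), make_amp([6])]
bynets = [Bynet("BYNET 0"), Bynet("BYNET 1")]


def is_even(row):
    return row % 2 == 0


print(dispatch(bynets, amps, is_even))  # sum of even rows across all AMPs
bynets[0].up = False                    # simulate a BYNET 0 failure
print(dispatch(bynets, amps, is_even))  # same result, now over BYNET 1
```

Each AMP works only on its own rows, and the query still completes when one BYNET is down because the other one carries the messages.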
Nodes
Each individual server in a Teradata system is referred to as a node. Each node has its own operating system, CPU, memory, disk space, and copy of the Teradata RDBMS software.