Database Management Systems (DBMS) are one of the fastest-growing fields and a cornerstone of cloud technology. Data has become a crucial asset for every organisation, and cloud computing cannot be implemented without a DBMS, since it provides the methods and techniques needed to manage huge amounts of data. Understanding and mastering DBMS techniques is therefore essential for anyone preparing for a related role in any organisation.
In this post, we look at the most frequently asked DBMS interview questions (with answers), categorised by difficulty level and followed by data-security questions and real-life DBMS implementation questions.
A Database Management System (DBMS) is used to store, manage, manipulate, retrieve and organise the data in a database. It works as an intermediary between users and the data, whether that data is hosted on a cloud platform or in an offline database. A DBMS offers many advantages, such as data sharing, backup and recovery, controlled redundancy, data independence, access to data from anywhere in the world, and data security.
The JOIN keyword combines rows from two or more tables based on a related column, either to express the relationship between them or to build a combined result set. SQL has four main types of joins: INNER, LEFT (outer), RIGHT (outer) and FULL (outer).
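As a minimal sketch (using Python's built-in sqlite3 module and two hypothetical tables, `customers` and `orders`), an INNER JOIN keeps only matching rows while a LEFT JOIN keeps every row from the left table:

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 250.0);
""")

# INNER JOIN: only customers that have a matching order.
inner = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# LEFT JOIN: every customer, with NULL where no order exists.
left = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()
```

Here `inner` contains only Asha (who has an order), while `left` also contains Ben with a NULL amount.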
| Feature | DBMS | RDBMS |
| --- | --- | --- |
| Definition | A software system for managing databases. | A type of DBMS based on the relational model. |
| Data Storage | Stores data as files in hierarchical or network models. | Stores data in tables (relations). |
| Normalisation | May not support normalisation. | Supports normalisation to reduce redundancy. |
| Query Language | May use various query languages. | Uses SQL (Structured Query Language) for querying. |
| Data Integrity | Basic integrity constraints. | Enforces integrity constraints through primary and foreign keys. |
| Examples | File systems, XML databases. | MySQL, PostgreSQL, Oracle, SQL Server. |
| Multi-user Support | Limited multi-user capabilities. | Robust multi-user support with transaction control. |
| Scalability | Generally less scalable for large data volumes. | Designed to handle large amounts of data and complex queries efficiently. |
A relation schema describes the structure of a relation and the relationships between entities in the database; these relationships can be shown visually with ER diagrams. This helps in quickly understanding the database and lets us define the database structure more effectively and accurately.
In an Object-Oriented Database Management System (OODBMS), data is stored as objects, which can represent real-world entities more directly. This makes it easier and more precise to capture the relationship between two datasets modelled as objects, which is an advantage over a traditional RDBMS for such workloads.
Concurrency control allows transactions in a multi-user database system to execute simultaneously without conflicting with each other, maintaining data consistency by limiting harmful transaction interaction. Overall, it helps prevent data loss, lost updates and other errors in the database.
Deadlock refers to a condition where two or more transactions cannot proceed because each one is waiting for a resource that is locked by another. The result is a cycle of dependencies in which none of the waiting transactions can complete.
This is part of creating a high-level model of the database, which gives a structured approach to how data is inserted and organised. It is independent of how the data is physically stored; by using entities, attributes, relationships and constraints, it leads to well-formed logical and physical database schemas.
A cursor is used to manipulate, navigate and return values from the database row by row. Cursors are generally used inside triggers, scripts and stored procedures. They are mainly helpful for row-by-row handling of data, such as complex calculations and conditional logic.
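A rough illustration of row-by-row processing, using Python's sqlite3 cursor over a hypothetical `scores` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (name TEXT, score INTEGER);
    INSERT INTO scores VALUES ('a', 40), ('b', 75), ('c', 90);
""")

# A cursor lets us walk the result set one row at a time and apply
# conditional logic per row, instead of loading everything at once.
cur = conn.execute("SELECT name, score FROM scores ORDER BY score")
passed = []
for name, score in cur:      # fetches rows one by one
    if score >= 50:          # per-row conditional logic
        passed.append(name)
```

After the loop, `passed` holds only the names whose score cleared the threshold.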
| Aspect | Logical Database Design | Physical Database Design |
| --- | --- | --- |
| Focus | Conceptual schema, entity-relationship model, data model | Storage structures, indexing, partitioning |
| Abstraction Level | High-level; focuses on data semantics and relationships | Low-level; deals with storage details and optimisation |
| Impact | Changes in logical design may require rewriting queries | Changes in physical design may require storage reorganisation |
| Example | Adding a new entity to the data model | Adding a new index to improve query performance |
Based on the data model and how data is stored and organised, DBMSs can be categorised as:
a. Hierarchical DBMS: stores data in a tree-like parent-child structure.
b. Relational DBMS (RDBMS): stores data in tables linked by keys (e.g., MySQL, Oracle).
c. Network DBMS: organises data as a graph, allowing a record to have multiple parents.
d. Object-Oriented DBMS (OODBMS): stores data as objects, as in object-oriented programming.
e. Object-Relational DBMS (ORDBMS): extends an RDBMS with object-oriented features such as user-defined types.
f. NoSQL DBMS: uses non-relational models such as document, key-value, column-family or graph stores (e.g., MongoDB, Redis).
g. Distributed DBMS: spreads data across multiple sites while presenting it as a single database.
The GRANT statement gives specific permissions on database objects to users or roles, such as SELECT, INSERT, UPDATE, DELETE and EXECUTE.
The REVOKE statement, on the other hand, removes previously granted privileges from those users, ensuring that a user can no longer perform the revoked operations on those objects.
| Feature | COMMIT | ROLLBACK |
| --- | --- | --- |
| Operation | Saves changes permanently | Reverts changes |
| Usage | Used when the transaction is successful and changes should be saved | Used when a transaction fails or needs to be undone |
| Effect | Changes become visible to all users | Changes are discarded, and the database returns to its previous state |
| Control | Ends the current transaction | Ends the current transaction |
| Data Integrity | Ensures data integrity by making changes permanent | Ensures data integrity by discarding partial changes |
| Syntax Example | `COMMIT;` | `ROLLBACK;` |
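A small sketch of the two outcomes, using sqlite3 and a hypothetical `accounts` table: the committed insert survives, while the rolled-back update is discarded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()        # COMMIT: the row is now permanent

conn.execute("UPDATE accounts SET balance = balance - 999 WHERE id = 1")
conn.rollback()      # ROLLBACK: the update is discarded

balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
```

After the rollback, the balance is still 100, exactly as it was at the last commit.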
A data dictionary acts as a centralised repository of metadata, containing information about the database’s structure, constraints and relationships. It helps with metadata storage, data integrity, collaboration, query optimisation and better decision-making.
Savepoints allow us to mark a point within a transaction so that we can roll back to it later without redoing the entire transaction from the start.
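A minimal sketch with sqlite3 (which supports SAVEPOINT natively), using a hypothetical table `t`: only the work after the savepoint is undone, while the earlier insert survives the commit.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode so we can
# issue BEGIN / SAVEPOINT / COMMIT statements ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (v INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT sp1")        # mark a point inside the transaction
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO sp1")      # undo only the work after the savepoint
conn.execute("COMMIT")

rows = [r[0] for r in conn.execute("SELECT v FROM t")]
```

The committed table contains only the value inserted before the savepoint.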
Data mining is the process of discovering patterns or trends in large datasets with the help of methods and techniques from statistics, machine learning and database systems, in order to support decision-making.
The Entity-Relationship (E-R) model describes the relationships between entities. This graphical representation helps a database engineer explain the structure in depth, expressing complex relations through diagrams and making them easy to visualise. It represents entities as rectangles, attributes as ovals, and relationships as diamonds.
| Feature | WHERE Clause | HAVING Clause |
| --- | --- | --- |
| Usage | Filters rows before grouping | Filters grouped rows after grouping |
| Applied To | Individual rows | Grouped rows |
| Aggregation | Cannot use aggregate functions | Typically filters on aggregate functions (e.g., SUM, AVG) |
| Syntax | Used before the GROUP BY clause | Used after the GROUP BY clause |
| Example | `SELECT * FROM table WHERE condition;` | `SELECT column, SUM(value) FROM table GROUP BY column HAVING condition;` |
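The difference is easiest to see in one query. A sketch using sqlite3 and a hypothetical `sales` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('north', 100), ('north', 300), ('south', 50), ('south', 20);
""")

# WHERE filters individual rows *before* grouping;
# HAVING filters the groups *after* aggregation.
rows = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 30          -- drops the 20 row before grouping
    GROUP BY region
    HAVING SUM(amount) > 60    -- keeps only groups with total > 60
""").fetchall()
```

The south group survives the WHERE step (with only its 50 row) but its total of 50 is then eliminated by HAVING, leaving only the north group.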
This refers to adding new attributes or properties to an existing entity type, modifying or expanding the entity’s structure without changing its core definition.
Livelock happens in concurrent systems when two or more processes continuously change their states in response to each other’s actions. No process makes real progress, leaving the system effectively stuck even though none of the processes is blocked.
| Aspect | Data Warehouse | Database |
| --- | --- | --- |
| Purpose | Stores historical data for analysis and reporting | Stores current, operational data for transaction processing |
| Data Type | Typically stores structured and semi-structured data | Primarily stores structured data |
| Data Volume | Handles large volumes of data (terabytes to petabytes) | Handles moderate to large volumes of data |
| Data Latency | Supports batch processing with low real-time requirements | Supports real-time data processing and updates |
| Data Sources | Integrates data from multiple sources and systems | Typically focuses on data from a single application or system |
| Usage | Used for business intelligence, data mining, and decision-making | Used for managing day-to-day operations and transactions |
| Examples | Amazon Redshift, Snowflake, Google BigQuery | MySQL, PostgreSQL, Oracle |
As the name suggests, temporary tables exist only for the duration of a database session and hold intermediate results, query output and other short-lived data. They are particularly useful for breaking complex operations into simpler, more manageable parts.
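A brief sketch with sqlite3, staging an intermediate result in a temporary table (the table name `big_orders` is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, amount INTEGER);
    INSERT INTO orders VALUES (1, 10), (2, 200), (3, 300);
""")

# A temporary table exists only for this session and is dropped
# automatically when the connection closes.
conn.execute("""
    CREATE TEMP TABLE big_orders AS
    SELECT * FROM orders WHERE amount > 100
""")
count = conn.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]
```

Later steps of a complex query can then work off `big_orders` instead of re-filtering the base table.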
Redundant Array of Independent Disks (RAID) technology is used to improve storage performance, availability, reliability and capacity by combining multiple physical disk drives into a single logical unit.
| Feature | UNION | UNION ALL |
| --- | --- | --- |
| Duplicate Rows | Removes duplicate rows from the result set | Includes all rows, including duplicates |
| Syntax | `SELECT … UNION SELECT …` | `SELECT … UNION ALL SELECT …` |
| Performance | May have slightly higher overhead due to duplicate removal | Generally faster than UNION, as no duplicates are removed |
| Use Case | Use when you want to combine and deduplicate rows | Use when you want to combine all rows, including duplicates |
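A side-by-side sketch with sqlite3 and two hypothetical tables sharing the value 2:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION deduplicates the combined rows; UNION ALL keeps everything.
union = conn.execute(
    "SELECT x FROM a UNION SELECT x FROM b ORDER BY x").fetchall()
union_all = conn.execute(
    "SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x").fetchall()
```

The duplicated value 2 appears once in `union` but twice in `union_all`.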
Integrity rules include: entity integrity (each row in a table must have a unique, non-null primary key), referential integrity (foreign key values must match primary key values in the referenced table), domain integrity (data values must fall within a specified range or domain), and user-defined integrity (custom rules specified by the user).
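Referential integrity in particular can be demonstrated with sqlite3 (which enforces foreign keys only once the pragma is enabled); the `customers`/`orders` schema here is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
""")
conn.execute("INSERT INTO customers VALUES (1)")
conn.execute("INSERT INTO orders VALUES (10, 1)")   # valid reference

# An order pointing at a non-existent customer violates referential integrity.
try:
    conn.execute("INSERT INTO orders VALUES (11, 999)")
    violated = False
except sqlite3.IntegrityError:
    violated = True
```

The second insert is rejected because customer 999 does not exist in the parent table.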
A correlated subquery depends on the outer query for its values: it is executed once for each row processed by the outer query, which makes it closely tied to that outer query.
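A short sketch with sqlite3 and a hypothetical `employees` table: the inner query references `e.dept` from the outer row, so it is re-evaluated per employee ("paid above their own department's average"):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('a', 'eng', 90), ('b', 'eng', 60), ('c', 'hr', 50), ('d', 'hr', 40);
""")

# The subquery uses e.dept from the outer row, so it runs once per row.
rows = conn.execute("""
    SELECT name FROM employees e
    WHERE salary > (SELECT AVG(salary) FROM employees
                    WHERE dept = e.dept)
    ORDER BY name
""").fetchall()
```

With an eng average of 75 and an hr average of 45, only 'a' and 'c' qualify.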
| Feature | Database Schema | Database State |
| --- | --- | --- |
| Definition | A structure that defines the database objects | A snapshot of the database at a particular moment |
| Purpose | Defines how data is organised and stored | Represents the actual data stored in the database |
| Changes Over Time | Changes infrequently; defines the database structure | Changes frequently as data is added or modified |
| Examples | Table definitions, constraints, relationships | Actual rows and values in the tables |
QBE, or Query By Example, is a method of querying databases through a simple graphical interface. Rather than writing SQL statements, the user fills in a grid or form with example data to specify the query conditions.
Stored procedures are pre-compiled SQL statements stored in the database that encapsulate the structure of a specific operation and need only their parameters supplied at call time. Advantages of stored procedures include improved performance (they are parsed and compiled once), reusability across applications, reduced network traffic, and tighter security through controlled access to the underlying tables.
| Feature | Optimistic Locking | Pessimistic Locking |
| --- | --- | --- |
| Assumption | Conflicts are rare, and transactions are likely to succeed | Conflicts are common, and transactions may fail |
| Locking Strategy | Does not acquire locks on resources during read operations | Acquires locks on resources during read operations |
| Conflict Detection | Checks for conflicts before committing a transaction | Locks resources to prevent conflicts with other transactions |
| Concurrency | Allows multiple transactions to access resources concurrently but detects conflicts before committing | Prevents concurrent access to resources by locking them |
| Usage | Suitable for environments with low conflict rates | Suitable for environments with high conflict rates or where data integrity is critical |
A two-phase commit (2PC) is a distributed algorithm that coordinates a transaction across all participating databases so that they either all commit or all roll back. It ensures that every database commits the transaction or none do, based on a unanimous agreement, maintaining data consistency across distributed databases.
| Aspect | Vertical Scaling | Horizontal Scaling |
| --- | --- | --- |
| Definition | Increasing the capacity of a single server (e.g., adding more CPU and RAM) | Adding more servers to distribute load (e.g., sharding) |
| Limitations | Limited by the maximum capacity of a single server | More flexible; limited by network and infrastructure |
| Complexity | Relatively simpler, as it involves upgrading hardware | More complex, as it involves distributed systems and synchronisation |
| Cost | Can be expensive, as high-end hardware is costly | Can be cost-effective, as commodity hardware can be used |
| Implementation | Requires downtime for hardware upgrades | Can be implemented with minimal downtime using load balancers |
| Examples | Adding more RAM to a server, upgrading the CPU | Adding more servers to a web application behind a load balancer |
Instead of natural keys, surrogate keys can serve as a table’s primary key, acting as unique identifiers with no business meaning. They are typically auto-incremented integers or globally unique identifiers (GUIDs), used when natural keys are unsuitable or too complex.
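A small sketch with sqlite3 (the `users` table and its `email` natural key are illustrative): the auto-incremented `id` column is the surrogate key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'id' is a surrogate key: an auto-generated integer with no business
# meaning, used instead of the natural key (the email address).
conn.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,
        email TEXT UNIQUE NOT NULL
    )
""")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
conn.execute("INSERT INTO users (email) VALUES ('b@example.com')")
ids = [r[0] for r in conn.execute("SELECT id FROM users ORDER BY id")]
```

The database assigns the identifiers itself, so they stay stable even if the email (the natural key) later changes.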
| Aspect | Horizontal Partitioning | Sharding |
| --- | --- | --- |
| Definition | Divides a table into multiple smaller tables (partitions) based on row ranges or hash values | Distributes rows of a table across multiple servers (shards) based on a shard key |
| Scalability | Improves read and write performance by distributing data and query load across multiple partitions | Improves read and write performance by distributing data and query load across multiple shards |
| Complexity | Relatively simpler, as it divides a single table into smaller parts within the same database | More complex, as it distributes data across multiple servers and requires managing shard keys |
| Fault Tolerance | Limited fault tolerance, as a failure in one partition can impact the entire table | Higher fault tolerance, as a failure in one shard affects only the data stored in that shard |
| Examples | Partitioning a customer table based on the customer ID range | Sharding a user table across multiple servers based on the user’s geographic location |
Pattern matching in SQL is done with the LIKE operator together with wildcard characters: '%' matches zero or more characters, and '_' matches exactly one character.
WHERE column_name LIKE 'a%';   -- values starting with "a"
WHERE column_name LIKE '_r%';  -- values with "r" as the second character
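A runnable sketch of both wildcards with sqlite3 (the `names` table is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (n TEXT);
    INSERT INTO names VALUES ('apple'), ('arm'), ('grape'), ('bring');
""")

# 'a%'  -> starts with "a";  '_r%' -> any one character, then "r".
starts_a = [r[0] for r in conn.execute(
    "SELECT n FROM names WHERE n LIKE 'a%' ORDER BY n")]
second_r = [r[0] for r in conn.execute(
    "SELECT n FROM names WHERE n LIKE '_r%' ORDER BY n")]
```

'apple' and 'arm' match the first pattern; 'arm', 'bring' and 'grape' match the second, since each has "r" as its second character.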
Pagination breaks a large result set into smaller, manageable parts called pages. It is used in web development and database queries to improve application performance and usability and to reduce the time needed to load data onto the screen. Pagination can be implemented using the LIMIT and OFFSET clauses.
Example:
SELECT * FROM table_name
ORDER BY column_name
LIMIT page_size OFFSET offset_value;
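The pattern above can be sketched end to end with sqlite3 (the `items` table and `fetch_page` helper are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)",
                 [(i,) for i in range(1, 11)])   # ids 1..10

page_size = 3

def fetch_page(page):
    # Page numbers start at 1; OFFSET skips the earlier pages.
    offset = (page - 1) * page_size
    return [r[0] for r in conn.execute(
        "SELECT id FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset))]

page2 = fetch_page(2)
```

With a page size of 3, page 2 holds ids 4, 5 and 6. Note that the ORDER BY is essential: without a stable ordering, LIMIT/OFFSET pages are not guaranteed to be consistent between queries.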
An ORM (Object-Relational Mapping) tool works as a bridge between object-oriented programming and relational databases, abstracting away the complexities of database interactions so that developers can apply object-oriented concepts to database records.
| Aspect | Normalisation | Denormalisation |
| --- | --- | --- |
| Definition | Process of organising data to reduce redundancy and improve data integrity | Process of intentionally introducing redundancy to improve query performance |
| Aim | Reduce data redundancy and dependency | Improve query performance by reducing the number of joins |
| Complexity | Increases complexity due to multiple tables and relationships | Simplifies queries but may lead to data anomalies and update anomalies |
| Data Integrity | Improves data integrity by minimising redundancy | May compromise data integrity due to redundant data |
| Query Performance | May require more joins and have slower query performance | Typically has faster query performance due to fewer joins |
| Storage Space | May require more storage space due to normalised tables | May require more storage space due to redundant data |
| Example | Customer and Order tables in a normalised database | Combining Customer and Order data into a single table for reporting purposes |
Backup and recovery tools are crucial for protecting a database against security issues, SQL injection and other data threats. Backups are essential for minimising data loss and recovering quickly from hardware failure; without them, data cannot be restored to a previous state, and handling corrupted data becomes extremely difficult.
There are many threats to any database system, whether it is hosted on a cloud network or on local hardware. Common threats include SQL injection, malware, insider misuse, weak authentication and denial-of-service attacks.
Caching improves database performance by keeping frequently accessed data in memory, making workflows faster. It cuts query time by avoiding repeated reads from the main disk for recurring tasks.
Here, you can mention tasks such as indexing frequently queried columns, optimising slow queries, archiving old data, caching, and tuning the server configuration to improve database performance.
Database audits monitor database activity to verify that privileged users perform all tasks within regulations. This greatly reduces the risk of threats and human error. Audits cover regulatory compliance, internal policies, security best practices and backup plans.
A slow-running SQL query can be improved by first indexing the columns on which the query filters or joins. Next, use LIMIT to restrict the result set to what is needed. Avoid SELECT * and select only the important columns. Prefer INNER JOIN over OUTER JOIN where possible, avoid joining unnecessarily large tables, and avoid wrapping indexed columns in functions inside the WHERE clause. Temporary tables can help reduce complexity, and EXISTS or IN can often replace a DISTINCT clause.
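The effect of the first step, indexing, can be observed through the query planner. A sketch with sqlite3 and its EXPLAIN QUERY PLAN statement (the `orders` table and index name are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i % 100) for i in range(1000)])

query = "SELECT * FROM orders WHERE customer_id = 42"

# Before indexing: the planner must scan the whole table.
before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")

# After indexing: the plan switches to an index search.
after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
```

The plan's detail text changes from a full-table SCAN to a SEARCH using the new index, which is exactly the improvement indexing is meant to deliver.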
You must identify the deadlock first, then kill the process blocking the sequence. After that, retry the transaction, followed by a review and optimisation of the queries involved. Deadlock detection and prevention tools can also help identify and prevent deadlock situations.
Here, you can describe any project using data warehousing concepts, such as a retail company analysing sales data: the schemas you created for storing customer, admin and other user details, the ETL (extract, transform, load) processes used to populate the data warehouse from various sources, and the resulting optimisation of inventory management.
For this, first look for an existing database backup, review it and test it. If there are no backups, the first task is to set up a regular backup schedule. Then check database health, review all privileges and monitor activity. Updating statistics and reviewing security settings are also important parts of routine maintenance, and logs are crucial for diagnosing errors and for backup recovery.
This generally depends on how big the database is and what type of data will be stored in it. Some commonly used databases offer easy-to-adopt interfaces and general-purpose features, such as Oracle, Microsoft SQL Server, MySQL and PostgreSQL.
In conclusion, understanding database management systems and their best practices will help you handle complex queries and scenarios. Successfully implementing DBMS techniques and following regular updates and new developments will also keep you current with the tools you work with. We hope these DBMS interview questions give you an idea of the types of questions asked in interviews.
© 2024 Hero Vired. All rights reserved