Monday, July 28, 2008

40. Final Nav

Final Navigation

1. Click the Target folder in the Navigator.

2. Tools---Warehouse Designer (don’t delete previous targets)

3. Targets---Create (manually)

Or

Drag and drop from the source folder/target folder in the Navigator.

Or

Import from DB/DWH.

4. Click the Mappings folder.

5. Tools---Mapping Designer (don’t delete old mappings)

6. Mappings---Create and name it.

7. Drag and drop sources (if not available, import them in the Source Analyzer).

8. Drag and drop targets.

9. Create a transformation and enter its columns.

Or

Drag and drop columns into the transformation from sources or targets.

Or (optional)

Drag and drop a transformation from the Transformations folder.

10. Link all sources, transformations and targets.

11. Validate and save.

---------------------------------------------------------------------------------------------------------------------------------

39. DWH Nav Steps

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Repository Manager:

1.Right-click on Repository (veerurepo)---2.Connect---3.Enter Username---4.Enter Password---5.Connect.

1.Click Folder menu---2.Create---3.Enter name (Vfolder)---4.Ok---5.Close window.

Designer:

1.Right-click on Repo (veerurepo)---2.Connect---3.Username & Password---4.Connect---5.Click folder (Vfolder)---6.Click source folder---

Tools---Source Analyzer---1.Sources---2.Import from Database---3.Enter ODBC DSN---4.Username (in caps: SCOTT)---5.Enter owner name (in caps: SCOTT)---6.Enter Password (tiger)---7.Connect---8.Select tables---9.Ok.

Note: If the DSN has not been created yet, the procedure is as follows.

1.Click the ellipsis (...) button---2.System DSN---3.Add---4.Select Oracle in OraDB10g-home1 (driver)---5.Finish---6.Enter kodad DSN in Data Source Name---7.Select TNS service name (server)---8.User Id (scott)---9.Test connection---10.Password (tiger)---11.Ok---

12.Ok (Oracle ODBC Driver Configuration window)---13.Ok (ODBC Data Source Administrator window)---14.Select ODBC data source (kodad DSN)---15.User name (SCOTT)---16.Owner name (SCOTT)---17.Password (tiger)---18.Connect---19.Select tables---20.Ok.

Note: The username must be in caps.

Tools---Warehouse Designer---1.Click Target folder---2.Targets---3.Create---4.Enter table name (vtable)---5.Select the database---6.Create---7.Done---8.Double-click on Target table---9.Columns tab---10.Click on icon---11.Enter the fields---12.Apply---13.Ok---14.Targets---15.Generate/Execute SQL---16.Enable selected Tables---17.Create table---18.Primary key---19.Connect---20.Enter ODBC DSN---21.User name---22.Password---23.Connect---24.Generate and Execute button---25.Close---26.Message box---27.OK.

Tools---Transformation Developer---1.Create---2.Select Transformation type---3.Enter name---4.Create---5.Done---6.Double-click Transformation---7.Click on icon to enter the fields---8.Make the edits---9.Validate---10.Ok---11.Apply---12.Ok.

Tools---Mapping Designer---1.Go to Navigator---2.Explore source folder---3.Drag & drop source tables---4.Explore target folder---5.Drag & drop target tables---6.Explore Transformations---7.Drag and drop Transformation---8.Link Source, Transformation, Target---9.Mappings---10.Validate.

------------------------------------------------------------------------------------------------------------

WORKFLOW MANAGER:

1.From left panel---2.Select Task folder---3.Tools---4.Task Developer---5.Tasks---6.Create---7.Select mapping---8.Ok---9.Done---

10.Connections---11.Relational---12.New---13.Select DB (Oracle)---14.Ok---

15.Enter name---16.Username---17.Password---18.Connect string---19.Code page---20.Ok---

21.Repeat steps 12 to 20---22.Close window---23.Double-click on task (EDIT)---

24.Mapping tab---25.Select sources from left panel---26.Click down arrow in Value column in TYPE row---27.Select veerusource in Relational Connection Browser---28.Ok---

29.Select target from left panel---30.Click down arrow in Value column in TYPE row---31.Select veerutarget in Relational Connection Browser and click Normal---32.Ok---33.Apply---34.Ok---35.Tools---36.Workflow Designer---37.Workflows---38.Create---39.Enter name for workflow (veeruworkflow)---40.Select server and click Ok---41.Click session from left panel---42.Drag and drop veerutask to right panel---43.Establish link---44.Save---

45.Workflows---Start Workflow---then run select * from the target table (vtable) to see the OUTPUT.

In DATABASE

For Source/Metadata Database

Connect sys as sysdba;

Create user veerus identified by veerus;

Grant connect, resource to veerus;

Connect veerus/veerus;

Select * from tab;

For Target Database

Connect sys as sysdba;

Create user veerut identified by veerut;

Grant connect, resource to veerut;

Connect veerut/veerut;

Select * from tab;

To copy tables

Connect scott/tiger;

Grant select on emp to veerus;

To paste tables into our sources

Connect veerus/veerus;

Create table emp as (select * from scott.emp);

Select * from tab;

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Precautions

1.In Source Analyzer---Remove all the FKs.

2.In Warehouse Designer---Enable selected tables, Create table, Primary key, Connect, Generate and Execute, Close.

3.In Workflow Manager---In the create workflow window, select the server after naming the workflow.

Thursday, July 17, 2008

38. ETL Concepts

Source: http://www.learnbi.com/informatica9.htm

Formatting Columns of Numeric Data:

  • 1. Open Microsoft Excel and select the columns that contain numeric data.
  • 2. Choose Format, then Cells.
  • 3. In the Number tab, select Number.
  • 4. Specify the number of decimal places and click OK.
  • 5. Choose File, then Save.
Step IV. Importing Excel source definition:

  • 1. In the Informatica Designer, connect to the repository and open the folder for the source definition.
  • 2. Open the Source Analyzer and choose Sources, then Import from Database.
  • 3. The Import Tables dialog opens. In the ODBC data source field, select the data source for the Microsoft Excel worksheet.
  • 4. Click Browse to open the ODBC Administrator.
  • 5. On the User DSN or System DSN tab, double-click the Microsoft Excel driver.
  • 6. Click Select Workbook and browse for the Microsoft Excel file, which is treated as a relational database.
  • 7. Click OK and connect.
  • 8. You do not need to enter a database username or password. The ranges defined in the Microsoft Excel file appear as table names.
  • 9. There is no owner name because we are not connected to a database.
  • 10. Select the table you want to import and click OK.
  • 11. The new source definition appears in the Navigator under the database name. Choose Repository, then Save.


37. Diagram of ETL

source: http://www.learndatamodeling.com/etl.htm

Figure 1.12 : Sample ETL Process Flow

36. ETL Concepts

Source: http://www.learndatamodeling.com/etl.htm

Source System
A database, application, file, or other storage facility from which the data in a data warehouse is derived.

Mapping
The definition of the relationship and data flow between source and target objects.

Metadata
Data that describes data and other structures, such as objects, business rules, and processes. For example, the schema design of a data warehouse is typically stored in a repository as metadata, which is used to generate scripts used to build and populate the data warehouse. A repository contains metadata.

Staging Area
A place where data is processed before entering the warehouse.

Cleansing
The process of resolving inconsistencies and fixing the anomalies in source data, typically as part of the ETL process.

Transformation
The process of manipulating data. Any manipulation beyond copying is a transformation. Examples include cleansing, aggregating, and integrating data from multiple sources.

Wednesday, July 16, 2008

35. INFORMATICA FAQ'S

SOURCE: 1. http://www.allinterview.com/showanswers/69154.html
SOURCE: 2. http://deviinformatica.blogspot.com/2007/09/informatica-interview-questions.html

Re: Difference between Filter and Router?
Answer
# 1
First, the similarity between Router and Filter:

-Both transformations can discard data based on
a filter condition.

The differences are...
1.A Router transformation can test incoming data against
multiple filter conditions, and if none of the conditions
is met, the incoming rows are routed to the DEFAULT
group.

2.A Filter transformation passes data to the next
transformation based on a single filter condition.
Unlike the Router transformation, it has no option for
routing the rows that do not match the condition.
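
The difference can be sketched in a few lines of Python. This is only an illustration of the behaviour described in the answer, not Informatica code: the function names `filter_rows` and `route_rows`, the group names and the sample rows are all made up for the example, and it assumes a row is copied to every group whose condition it meets.

```python
def filter_rows(rows, condition):
    """Filter: keep rows that meet the single condition; the rest are dropped."""
    return [row for row in rows if condition(row)]


def route_rows(rows, groups):
    """Router: test each row against every group condition.

    A row is copied to each group whose condition it meets; rows that meet
    no condition go to the DEFAULT group instead of being dropped.
    """
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = [name for name, condition in groups.items() if condition(row)]
        for name in matched:
            routed[name].append(row)
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


# Hypothetical sample rows (employee name, department number, salary).
rows = [
    {"ename": "SMITH", "deptno": 20, "sal": 800},
    {"ename": "ALLEN", "deptno": 30, "sal": 1600},
    {"ename": "KING", "deptno": 10, "sal": 5000},
]

filtered = filter_rows(rows, lambda r: r["deptno"] == 20)
routed = route_rows(rows, {
    "DEPT20": lambda r: r["deptno"] == 20,
    "DEPT30": lambda r: r["deptno"] == 30,
})

print([r["ename"] for r in filtered])
print({name: [r["ename"] for r in group] for name, group in routed.items()})
```

With the filter, the KING row simply disappears; with the router, it is still available in the DEFAULT group.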

Monday, July 14, 2008

34. PL/SQL SYLLABUS

techonthenet.com

Oracle is a relational database technology.

PLSQL stands for "Procedural Language extensions to SQL", and can be used in Oracle databases. PLSQL is closely integrated into the SQL language, yet it adds programming constructs that are not native to SQL.

We've categorized Oracle and PLSQL into the following topics:

Data Types SELECT Statement
Literals (Constants) DISTINCT
Declaring Variables COUNT / SUM / MIN / MAX
Is Null / Is Not Null

WHERE Clause
Loops and Conditional Statements "AND" Condition
Sequences (Autonumber) "OR" Condition
Transactions Combining "AND" with "OR"
Cursors
Functions (Built-In) (By Category) "LIKE" Condition
Functions (Built-In) (Alphabetical) "IN" Function
Oracle System Tables BETWEEN Condition

EXISTS Condition
Primary Keys GROUP BY
Foreign Keys HAVING
Unique Constraints ORDER BY (sort by)
Check Constraints
Indexes JOINS (inner, outer)

Subqueries
Creating Functions
Creating Procedures UNION Query
Creating Triggers UNION ALL Query
Exception Handling INTERSECT Query
Oracle Error Messages MINUS Query


Grant/Revoke Privileges UPDATE Statement
Roles (set of privileges) INSERT Statement
Change Password DELETE Statement


Synonyms (create, drop) Tables (create, alter, drop, temp)

Views

33. SQL Syllabus on net

http://download.oracle.com/docs/cd/B10501_01/server.920/a90842/ch13.htm#1012319

32. SQL TUTORIAL


http://www.sql-tutorial.net
SQL Tutorial
SQL Database Table
SQL SELECT
SQL SELECT INTO
SQL DISTINCT
SQL WHERE
SQL LIKE
SQL INSERT INTO
SQL UPDATE
SQL DELETE
SQL ORDER BY
SQL OR & AND
SQL IN
SQL BETWEEN
SQL Aliases
SQL COUNT
SQL MAX
SQL MIN
SQL AVG
SQL SUM
SQL GROUP BY
SQL HAVING
SQL JOIN
SQL Training
SQL Server
SQL Hosting

Sunday, July 13, 2008

31. SQL QUERIES

TIZAG.COM

SQL - Subqueries

Older MySQL versions (before 4.1) did not support subqueries, whereas Oracle and DB2 fully support them. Subqueries are SELECT queries placed within an existing SQL statement. They may exist in any of the following types of SQL statements.

  • Select
  • Insert
  • Update
  • Delete
  • Set
  • Do

Subqueries are great for answering very specific questions about the data in your database. For instance, as the employer you may notice that employee number 101 had a great day yesterday with sales. Given just this information, we can use a subquery to pull the employee's last name and first name from our database.

SQL Code:

SELECT * FROM employees
WHERE id =
(SELECT EmployeeID FROM invoices WHERE EmployeeID='1');

SQL Table:

id Lastname Firstname Title
11 Davis Julie MANAGER

Here we have pulled our employee information from the employees table by only knowing the employee number from the invoices table.

SQL - Subquery Inserts

Subqueries can be used to pull old data from your database and insert it into new tables. For instance if we opened up a third store and we wanted to place the same manager over 3 stores we could do this by pulling the manager's information using a subquery and then inserting the records. Also note that this form of insert will insert all cases where the subquery is true, therefore several rows may or may not be inserted depending upon how your table is set up.

SQL Code:

INSERT INTO employees3
(id,Lastname,Firstname,Title)
(SELECT id,Lastname,Firstname,Title
FROM employees WHERE Title='manager');

Once you have mastered subqueries, you can see how much can be expressed in a single SQL statement.
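
Both forms of subquery can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: SQLite stands in for the database, the `invoices` layout and the second employee row are invented for the example, and `IN` is used instead of `=` so the query still works when an employee has several invoices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees  (id INTEGER, Lastname TEXT, Firstname TEXT, Title TEXT);
    CREATE TABLE employees3 (id INTEGER, Lastname TEXT, Firstname TEXT, Title TEXT);
    CREATE TABLE invoices   (InvoiceID INTEGER, EmployeeID INTEGER);
    INSERT INTO employees VALUES (11, 'Davis', 'Julie', 'MANAGER');
    INSERT INTO employees VALUES (12, 'Smith', 'Sam', 'CLERK');  -- hypothetical row
    INSERT INTO invoices VALUES (1, 11);
    INSERT INTO invoices VALUES (2, 11);
""")

# Subquery in a SELECT: look up the employee through the invoices table.
subquery_rows = conn.execute("""
    SELECT id, Lastname, Firstname, Title FROM employees
    WHERE id IN (SELECT EmployeeID FROM invoices WHERE EmployeeID = 11)
""").fetchall()

# Subquery in an INSERT: copy every manager into the third store's table.
conn.execute("""
    INSERT INTO employees3 (id, Lastname, Firstname, Title)
    SELECT id, Lastname, Firstname, Title FROM employees WHERE Title = 'MANAGER'
""")
copied_rows = conn.execute("SELECT * FROM employees3").fetchall()

print(subquery_rows)
print(copied_rows)
```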

30. CREATE VIEW

What is a View?

In SQL, a VIEW is a virtual table based on the result-set of a SELECT statement.

A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database. You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from a single table.

Note: The database design and structure will NOT be affected by the functions, where, or join statements in a view.

Syntax

CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition

Note: The database does not store the view data! The database engine recreates the data, using the view's SELECT statement, every time a user queries a view.
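
The note above can be checked with a small experiment. The sketch below uses Python's sqlite3 module as a stand-in database; the `Products` table, its rows and the view name `current_products` are hypothetical. A row added to the base table after the view was created shows up the next time the view is queried, because the view's SELECT is run again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductID INTEGER, ProductName TEXT, Discontinued INTEGER);
    INSERT INTO Products VALUES (1, 'Chai', 0);
    INSERT INTO Products VALUES (2, 'Chang', 1);
    CREATE VIEW current_products AS
        SELECT ProductID, ProductName FROM Products WHERE Discontinued = 0;
""")

before = conn.execute(
    "SELECT ProductName FROM current_products ORDER BY ProductID"
).fetchall()

# Change the base table only; the view itself is not touched.
conn.execute("INSERT INTO Products VALUES (3, 'Tofu', 0)")

after = conn.execute(
    "SELECT ProductName FROM current_products ORDER BY ProductID"
).fetchall()

print(before)
print(after)
```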


Using Views

A view can be used from inside a query, a stored procedure, or from inside another view. Adding functions, joins, etc., to a view lets you present exactly the data you want to the user.

The sample database Northwind has some views installed by default. The view "Current Product List" lists all active products (products that are not discontinued) from the Products table. The view is created with the following SQL:

CREATE VIEW [Current Product List] AS
SELECT ProductID,ProductName
FROM Products
WHERE Discontinued=No

We can query the view above as follows:

SELECT * FROM [Current Product List]

Another view from the Northwind sample database selects every product in the Products table that has a unit price that is higher than the average unit price:

CREATE VIEW [Products Above Average Price] AS
SELECT ProductName,UnitPrice
FROM Products
WHERE UnitPrice>(SELECT AVG(UnitPrice) FROM Products)

We can query the view above as follows:

SELECT * FROM [Products Above Average Price]

Another example view from the Northwind database calculates the total sale for each category in 1997. Note that this view selects its data from another view called "Product Sales for 1997":

CREATE VIEW [Category Sales For 1997] AS
SELECT DISTINCT CategoryName,Sum(ProductSales) AS CategorySales
FROM [Product Sales for 1997]
GROUP BY CategoryName

We can query the view above as follows:

SELECT * FROM [Category Sales For 1997]

We can also add a condition to the query. Now we want to see the total sale only for the category "Beverages":

SELECT * FROM [Category Sales For 1997]
WHERE CategoryName='Beverages'

29. SELECT INTO

The SELECT INTO Statement

The SELECT INTO statement is most often used to create backup copies of tables or for archiving records.

Syntax

SELECT column_name(s) INTO newtable [IN externaldatabase]
FROM source


Make a Backup Copy

The following example makes a backup copy of the "Persons" table:

SELECT * INTO Persons_backup
FROM Persons

The IN clause can be used to copy tables into another database:

SELECT Persons.* INTO Persons IN 'Backup.mdb'
FROM Persons

If you only want to copy a few fields, you can do so by listing them after the SELECT statement:

SELECT LastName,FirstName INTO Persons_backup
FROM Persons

You can also add a WHERE clause. The following example creates a "Persons_backup" table with two columns (LastName and FirstName) by extracting the persons who live in "Sandnes" from the "Persons" table:

SELECT LastName,Firstname INTO Persons_backup
FROM Persons
WHERE City='Sandnes'

Selecting data from more than one table is also possible. The following example creates a new table "Empl_Ord_backup" that contains data from the two tables Employees and Orders:

SELECT Employees.Name,Orders.Product
INTO Empl_Ord_backup
FROM Employees
INNER JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID
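
Not every database accepts SELECT INTO for creating tables. As a sketch, assuming SQLite (through Python's sqlite3 module) as the test database, the same backup can be made with CREATE TABLE ... AS SELECT, which is a different statement from the one shown above but produces the same kind of copy. The sample rows are taken from the "Persons" table used elsewhere in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT);
    INSERT INTO Persons VALUES ('Hansen', 'Ola', 'Timoteivn 10', 'Sandnes');
    INSERT INTO Persons VALUES ('Svendson', 'Tove', 'Borgvn 23', 'Sandnes');
    INSERT INTO Persons VALUES ('Pettersen', 'Kari', 'Storgt 20', 'Stavanger');
""")

# SQLite equivalent of:
#   SELECT LastName,FirstName INTO Persons_backup FROM Persons WHERE City='Sandnes'
conn.execute("""
    CREATE TABLE Persons_backup AS
    SELECT LastName, FirstName FROM Persons WHERE City = 'Sandnes'
""")

backup_rows = conn.execute(
    "SELECT LastName, FirstName FROM Persons_backup ORDER BY LastName"
).fetchall()
print(backup_rows)
```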

28. GROUP BY

GROUP BY...

GROUP BY... was added to SQL because aggregate functions (like SUM) return a single aggregate over all the rows they are applied to, and without the GROUP BY clause it was impossible to find the sum for each individual group of column values.

The syntax for the GROUP BY clause is:

SELECT column,SUM(column) FROM table GROUP BY column


GROUP BY Example

This "Sales" Table:

Company Amount
W3Schools 5500
IBM 4500
W3Schools 7100

And This SQL:

SELECT Company, SUM(Amount) FROM Sales

Returns this result:

Company SUM(Amount)
W3Schools 17100
IBM 17100
W3Schools 17100

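The above statement is not valid standard SQL because the Company column is neither aggregated nor listed in a GROUP BY clause. Most databases reject it, and a database that accepts it returns a misleading result. A GROUP BY clause solves this problem: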

SELECT Company,SUM(Amount) FROM Sales
GROUP BY Company

Returns this result:

Company SUM(Amount)
W3Schools 12600
IBM 4500


HAVING...

HAVING... was added to SQL because the WHERE keyword cannot be used with aggregate functions (like SUM), and without HAVING... it would be impossible to filter on aggregated results.

The syntax for the HAVING clause is:

SELECT column,SUM(column) FROM table
GROUP BY column
HAVING SUM(column) condition value

This "Sales" Table:

Company Amount
W3Schools 5500
IBM 4500
W3Schools 7100

This SQL:

SELECT Company,SUM(Amount) FROM Sales
GROUP BY Company
HAVING SUM(Amount)>10000

Returns this result

Company SUM(Amount)
W3Schools 12600
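
The two results above can be reproduced with Python's sqlite3 module. This is a small sketch that assumes SQLite as the database; the ORDER BY is added only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (Company TEXT, Amount INTEGER);
    INSERT INTO Sales VALUES ('W3Schools', 5500);
    INSERT INTO Sales VALUES ('IBM', 4500);
    INSERT INTO Sales VALUES ('W3Schools', 7100);
""")

# One total per company.
totals = conn.execute("""
    SELECT Company, SUM(Amount) FROM Sales
    GROUP BY Company
    ORDER BY Company
""").fetchall()

# Only the companies whose total exceeds 10000.
large_totals = conn.execute("""
    SELECT Company, SUM(Amount) FROM Sales
    GROUP BY Company
    HAVING SUM(Amount) > 10000
""").fetchall()

print(totals)
print(large_totals)
```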

27. FUNCTIONS

SQL has a lot of built-in functions for counting and calculations.


Function Syntax

The syntax for built-in SQL functions is:

SELECT function(column) FROM table


Types of Functions

There are several basic types and categories of functions in SQL. The basic types of functions are:

  • Aggregate Functions
  • Scalar functions

Aggregate functions

Aggregate functions operate against a collection of values, but return a single value.

Note: If the item list of a SELECT statement mixes aggregate functions with other, non-aggregated columns, the SELECT must have a GROUP BY clause!

"Persons" table (used in most examples)

Name Age
Hansen, Ola 34
Svendson, Tove 45
Pettersen, Kari 19

Aggregate functions in MS Access

Function Description
AVG(column) Returns the average value of a column
COUNT(column) Returns the number of rows (without a NULL value) of a column
COUNT(*) Returns the number of selected rows
FIRST(column) Returns the value of the first record in a specified field
LAST(column) Returns the value of the last record in a specified field
MAX(column) Returns the highest value of a column
MIN(column) Returns the lowest value of a column
STDEV(column)
STDEVP(column)
SUM(column) Returns the total sum of a column
VAR(column)
VARP(column)

Aggregate functions in SQL Server

Function Description
AVG(column) Returns the average value of a column
BINARY_CHECKSUM
CHECKSUM
CHECKSUM_AGG
COUNT(column) Returns the number of rows (without a NULL value) of a column
COUNT(*) Returns the number of selected rows
COUNT(DISTINCT column) Returns the number of distinct results
FIRST(column) Returns the value of the first record in a specified field (not supported in SQLServer2K)
LAST(column) Returns the value of the last record in a specified field (not supported in SQLServer2K)
MAX(column) Returns the highest value of a column
MIN(column) Returns the lowest value of a column
STDEV(column)
STDEVP(column)
SUM(column) Returns the total sum of a column
VAR(column)
VARP(column)


Scalar functions

Scalar functions operate against a single value, and return a single value based on the input value.

Useful Scalar Functions in MS Access

Function Description
UCASE(c) Converts a field to upper case
LCASE(c) Converts a field to lower case
MID(c,start[,length]) Extracts characters from a text field
LEN(c) Returns the length of a text field
INSTR(c,char) Returns the numeric position of a named character within a text field
LEFT(c,number_of_char) Return the left part of a text field requested
RIGHT(c,number_of_char) Return the right part of a text field requested
ROUND(c,decimals) Rounds a numeric field to the number of decimals specified
MOD(x,y) Returns the remainder of a division operation
NOW() Returns the current system date and time
FORMAT(c,format) Changes the way a field is displayed
DATEDIFF(d,date1,date2) Used to perform date calculations
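
To make the aggregate/scalar distinction concrete, here is a small sketch using Python's sqlite3 module on the "Persons" table above. SQLite is an assumption here, and its function names differ from the MS Access ones listed: UPPER and LENGTH are used in place of UCASE and LEN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (Name TEXT, Age INTEGER);
    INSERT INTO Persons VALUES ('Hansen, Ola', 34);
    INSERT INTO Persons VALUES ('Svendson, Tove', 45);
    INSERT INTO Persons VALUES ('Pettersen, Kari', 19);
""")

# Aggregate functions: many rows in, one row out.
aggregates = conn.execute(
    "SELECT COUNT(*), MIN(Age), MAX(Age), SUM(Age), AVG(Age) FROM Persons"
).fetchone()

# Scalar functions: one value out for each input row.
scalars = conn.execute(
    "SELECT UPPER(Name), LENGTH(Name) FROM Persons ORDER BY Age"
).fetchall()

print(aggregates)
print(scalars)
```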

26. ALTER

ALTER TABLE

The ALTER TABLE statement is used to add or drop columns in an existing table.

ALTER TABLE table_name
ADD column_name datatype

ALTER TABLE table_name
DROP COLUMN column_name

Note: Some database systems don't allow the dropping of a column in a database table (DROP COLUMN column_name).


Person:

LastName FirstName Address
Pettersen Kari Storgt 20


Example

To add a column named "City" in the "Person" table:

ALTER TABLE Person ADD City varchar(30)

Result:

LastName FirstName Address City
Pettersen Kari Storgt 20

Example

To drop the "Address" column in the "Person" table:

ALTER TABLE Person DROP COLUMN Address

Result:

LastName FirstName City
Pettersen Kari
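
The ADD example can be run as a quick check with Python's sqlite3 module (SQLite is an assumption; it is not the database these notes were written for). Only the ADD step is shown because, as noted above, DROP COLUMN support depends on the database system and version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (LastName varchar(30), FirstName varchar(30), Address varchar(30));
    INSERT INTO Person VALUES ('Pettersen', 'Kari', 'Storgt 20');
""")

conn.execute("ALTER TABLE Person ADD COLUMN City varchar(30)")

# PRAGMA table_info lists one row per column; index 1 holds the column name.
columns = [info[1] for info in conn.execute("PRAGMA table_info(Person)")]
row = conn.execute("SELECT * FROM Person").fetchone()

print(columns)
print(row)  # the new City column is empty (NULL) for existing rows
```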

25. DROP

Drop Index

You can delete an existing index in a table with the DROP INDEX statement.

Syntax for Microsoft Jet SQL (and Microsoft Access):

DROP INDEX index_name ON table_name

Syntax for MS SQL Server:

DROP INDEX table_name.index_name

Syntax for IBM DB2 and Oracle:

DROP INDEX index_name

Syntax for MySQL:

ALTER TABLE table_name DROP INDEX index_name


Delete a Table or Database

To delete a table (the table structure, attributes, and indexes will also be deleted):

DROP TABLE table_name

To delete a database:

DROP DATABASE database_name


Truncate a Table

What if we only want to get rid of the data inside a table, and not the table itself? Use the TRUNCATE TABLE command (deletes only the data inside the table):

TRUNCATE TABLE table_name
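
The difference between emptying a table and dropping it can be seen in the sketch below, which uses Python's sqlite3 module. Two assumptions: SQLite is the test database, and because SQLite has no TRUNCATE TABLE statement, a DELETE without a WHERE clause is used in its place. The helper `table_exists` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (LastName TEXT, FirstName TEXT);
    CREATE INDEX PersonIndex ON Person (LastName);
    INSERT INTO Person VALUES ('Pettersen', 'Kari');
""")


def table_exists(name):
    """Return True if a table with this name is listed in sqlite_master."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row[0] == 1


# Remove the data only (stand-in for TRUNCATE TABLE): the table remains.
conn.execute("DELETE FROM Person")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
exists_after_delete = table_exists("Person")

# Remove the index, then the table itself.
conn.execute("DROP INDEX PersonIndex")
conn.execute("DROP TABLE Person")
exists_after_drop = table_exists("Person")

print(rows_after_delete, exists_after_delete, exists_after_drop)
```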

24. CREATE

Create a Database

To create a database:

CREATE DATABASE database_name


Create a Table

To create a table in a database:

CREATE TABLE table_name
(
column_name1 data_type,
column_name2 data_type,
.......
)

Example

This example demonstrates how you can create a table named "Person", with four columns. The column names will be "LastName", "FirstName", "Address", and "Age":

CREATE TABLE Person
(
LastName varchar,
FirstName varchar,
Address varchar,
Age int
)

This example demonstrates how you can specify a maximum length for some columns:

CREATE TABLE Person
(
LastName varchar(30),
FirstName varchar,
Address varchar,
Age int(3)
)

The data type specifies what type of data the column can hold. The table below contains the most common data types in SQL:

Data Type Description
integer(size), int(size), smallint(size), tinyint(size) Hold integers only. The maximum number of digits is specified in parentheses.
decimal(size,d), numeric(size,d) Hold numbers with fractions. The maximum number of digits is specified in "size". The maximum number of digits to the right of the decimal point is specified in "d".
char(size) Holds a fixed-length string (can contain letters, numbers, and special characters). The fixed size is specified in parentheses.
varchar(size) Holds a variable-length string (can contain letters, numbers, and special characters). The maximum size is specified in parentheses.
date(yyyymmdd) Holds a date


Create Index

Indexes are created in an existing table to locate rows more quickly and efficiently. It is possible to create an index on one or more columns of a table, and each index is given a name. Users cannot see the indexes; they are just used to speed up queries.

Note: Updating a table that has indexes takes more time than updating a table without them, because the indexes also need to be updated. So it is a good idea to create indexes only on columns that are often used for a search.

A Unique Index

Creates a unique index on a table. A unique index means that two rows cannot have the same index value.

CREATE UNIQUE INDEX index_name
ON table_name (column_name)

The "column_name" specifies the column you want indexed.

A Simple Index

Creates a simple index on a table. When the UNIQUE keyword is omitted, duplicate values are allowed.

CREATE INDEX index_name
ON table_name (column_name)

The "column_name" specifies the column you want indexed.

Example

This example creates a simple index, named "PersonIndex", on the LastName field of the Person table:

CREATE INDEX PersonIndex
ON Person (LastName)

If you want to index the values in a column in descending order, you can add the reserved word DESC after the column name:

CREATE INDEX PersonIndex
ON Person (LastName DESC)

If you want to index more than one column you can list the column names within the parentheses, separated by commas:

CREATE INDEX PersonIndex
ON Person (LastName, FirstName)
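
What a unique index enforces can be seen in the short sketch below, written with Python's sqlite3 module (SQLite is an assumption, and the index name `PersonUniqueIndex` is made up). The second insert repeats the same LastName and FirstName pair, so the database refuses it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person
    (
    LastName varchar(30),
    FirstName varchar(30),
    Address varchar(30),
    Age int
    );
    CREATE UNIQUE INDEX PersonUniqueIndex
    ON Person (LastName, FirstName);
""")

conn.execute("INSERT INTO Person VALUES ('Pettersen', 'Kari', 'Storgt 20', 19)")

duplicate_rejected = False
try:
    # Same (LastName, FirstName) pair as the row above.
    conn.execute("INSERT INTO Person VALUES ('Pettersen', 'Kari', 'Borgvn 23', 45)")
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(duplicate_rejected, row_count)
```

With a simple (non-unique) index on the same columns, the second insert would succeed.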

23. UNION

UNION

The UNION command is used to select related information from two tables, much like the JOIN command. However, when using the UNION command, both SELECT statements must return the same number of columns, and the corresponding columns need to be of the same data type.

Note: With UNION, only distinct values are selected.

SQL Statement 1
UNION
SQL Statement 2


Employees_Norway:

E_ID E_Name
01 Hansen, Ola
02 Svendson, Tove
03 Svendson, Stephen
04 Pettersen, Kari

Employees_USA:

E_ID E_Name
01 Turner, Sally
02 Kent, Clark
03 Svendson, Stephen
04 Scott, Stephen


Using the UNION Command

Example

List all different employee names in Norway and USA:

SELECT E_Name FROM Employees_Norway
UNION
SELECT E_Name FROM Employees_USA

Result

E_Name
Hansen, Ola
Svendson, Tove
Svendson, Stephen
Pettersen, Kari
Turner, Sally
Kent, Clark
Scott, Stephen

Note: This command cannot be used to list all employees in Norway and USA. In the example above we have two employees with equal names, and only one of them is listed. The UNION command only selects distinct values.


UNION ALL

The UNION ALL command works like the UNION command, except that UNION ALL selects all values, including duplicates.

SQL Statement 1
UNION ALL
SQL Statement 2


Using the UNION ALL Command

Example

List all employees in Norway and USA:

SELECT E_Name FROM Employees_Norway
UNION ALL
SELECT E_Name FROM Employees_USA

Result

E_Name
Hansen, Ola
Svendson, Tove
Svendson, Stephen
Pettersen, Kari
Turner, Sally
Kent, Clark
Svendson, Stephen
Scott, Stephen
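
Both commands can be compared side by side with Python's sqlite3 module (SQLite is assumed as the test database). The sketch loads the two employee tables above and counts the names each command returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees_Norway (E_ID TEXT, E_Name TEXT);
    CREATE TABLE Employees_USA (E_ID TEXT, E_Name TEXT);
    INSERT INTO Employees_Norway VALUES ('01', 'Hansen, Ola');
    INSERT INTO Employees_Norway VALUES ('02', 'Svendson, Tove');
    INSERT INTO Employees_Norway VALUES ('03', 'Svendson, Stephen');
    INSERT INTO Employees_Norway VALUES ('04', 'Pettersen, Kari');
    INSERT INTO Employees_USA VALUES ('01', 'Turner, Sally');
    INSERT INTO Employees_USA VALUES ('02', 'Kent, Clark');
    INSERT INTO Employees_USA VALUES ('03', 'Svendson, Stephen');
    INSERT INTO Employees_USA VALUES ('04', 'Scott, Stephen');
""")

union_names = [row[0] for row in conn.execute("""
    SELECT E_Name FROM Employees_Norway
    UNION
    SELECT E_Name FROM Employees_USA
""")]

union_all_names = [row[0] for row in conn.execute("""
    SELECT E_Name FROM Employees_Norway
    UNION ALL
    SELECT E_Name FROM Employees_USA
""")]

print(len(union_names), len(union_all_names))
```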

22. JOINS

Joins and Keys

Sometimes we have to select data from two or more tables to make our result complete. We have to perform a join.

Tables in a database can be related to each other with keys. A primary key is a column (or combination of columns) with a unique value for each row in the table. The purpose is to bind data together, across tables, without repeating all of the data in every table.

In the "Employees" table below, the "Employee_ID" column is the primary key, meaning that no two rows can have the same Employee_ID. The Employee_ID distinguishes two persons even if they have the same name.

When you look at the example tables below, notice that:

  • The "Employee_ID" column is the primary key of the "Employees" table
  • The "Prod_ID" column is the primary key of the "Orders" table
  • The "Employee_ID" column in the "Orders" table is used to refer to the persons in the "Employees" table without using their names

Employees:

Employee_ID Name
01 Hansen, Ola
02 Svendson, Tove
03 Svendson, Stephen
04 Pettersen, Kari

Orders:

Prod_ID Product Employee_ID
234 Printer 01
657 Table 03
865 Chair 03


Referring to Two Tables

We can select data from two tables by referring to two tables, like this:

Example

Who has ordered a product, and what did they order?

SELECT Employees.Name, Orders.Product
FROM Employees, Orders
WHERE Employees.Employee_ID=Orders.Employee_ID

Result

Name Product
Hansen, Ola Printer
Svendson, Stephen Table
Svendson, Stephen Chair

Example

Who ordered a printer?

SELECT Employees.Name
FROM Employees, Orders
WHERE Employees.Employee_ID=Orders.Employee_ID
AND Orders.Product='Printer'

Result

Name
Hansen, Ola


Using Joins

Or we can select data from two tables with the JOIN keyword, like this:

Example INNER JOIN

Syntax

SELECT field1, field2, field3
FROM first_table
INNER JOIN second_table
ON first_table.keyfield = second_table.foreign_keyfield

Who has ordered a product, and what did they order?

SELECT Employees.Name, Orders.Product
FROM Employees
INNER JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID

The INNER JOIN returns all rows from both tables where there is a match. If there are rows in Employees that do not have matches in Orders, those rows will not be listed.

Result

Name Product
Hansen, Ola Printer
Svendson, Stephen Table
Svendson, Stephen Chair

Example LEFT JOIN

Syntax

SELECT field1, field2, field3
FROM first_table
LEFT JOIN second_table
ON first_table.keyfield = second_table.foreign_keyfield

List all employees, and their orders - if any.

SELECT Employees.Name, Orders.Product
FROM Employees
LEFT JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID

The LEFT JOIN returns all the rows from the first table (Employees), even if there are no matches in the second table (Orders). If there are rows in Employees that do not have matches in Orders, those rows also will be listed.

Result

Name Product
Hansen, Ola Printer
Svendson, Tove
Svendson, Stephen Table
Svendson, Stephen Chair
Pettersen, Kari

Example RIGHT JOIN

Syntax

SELECT field1, field2, field3
FROM first_table
RIGHT JOIN second_table
ON first_table.keyfield = second_table.foreign_keyfield

List all orders, and who has ordered - if any.

SELECT Employees.Name, Orders.Product
FROM Employees
RIGHT JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID

The RIGHT JOIN returns all the rows from the second table (Orders), even if there are no matches in the first table (Employees). If there had been any rows in Orders that did not have matches in Employees, those rows also would have been listed.

Result

Name Product
Hansen, Ola Printer
Svendson, Stephen Table
Svendson, Stephen Chair

Example

Who ordered a printer?

SELECT Employees.Name
FROM Employees
INNER JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID
WHERE Orders.Product = 'Printer'

Result

Name
Hansen, Ola
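
The INNER JOIN and LEFT JOIN results above can be reproduced with Python's sqlite3 module. This is a sketch assuming SQLite as the database; RIGHT JOIN is left out because older SQLite versions do not support it, and the ORDER BY clauses are added only to make the row order predictable. A missing product shows up as `None` in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Employee_ID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (Prod_ID INTEGER PRIMARY KEY, Product TEXT, Employee_ID TEXT);
    INSERT INTO Employees VALUES ('01', 'Hansen, Ola');
    INSERT INTO Employees VALUES ('02', 'Svendson, Tove');
    INSERT INTO Employees VALUES ('03', 'Svendson, Stephen');
    INSERT INTO Employees VALUES ('04', 'Pettersen, Kari');
    INSERT INTO Orders VALUES (234, 'Printer', '01');
    INSERT INTO Orders VALUES (657, 'Table', '03');
    INSERT INTO Orders VALUES (865, 'Chair', '03');
""")

# Only employees that have at least one matching order.
inner_rows = conn.execute("""
    SELECT Employees.Name, Orders.Product
    FROM Employees
    INNER JOIN Orders
    ON Employees.Employee_ID = Orders.Employee_ID
    ORDER BY Orders.Prod_ID
""").fetchall()

# Every employee, with NULL for the product when there is no order.
left_rows = conn.execute("""
    SELECT Employees.Name, Orders.Product
    FROM Employees
    LEFT JOIN Orders
    ON Employees.Employee_ID = Orders.Employee_ID
    ORDER BY Employees.Employee_ID, Orders.Prod_ID
""").fetchall()

print(inner_rows)
print(left_rows)
```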

21. ALIASES

Column Name Alias

The syntax is:

SELECT column AS column_alias FROM table


Table Name Alias

The syntax is:

SELECT column FROM table AS table_alias


Example: Using a Column Alias

This table (Persons):

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Pettersen Kari Storgt 20 Stavanger

And this SQL:

SELECT LastName AS Family, FirstName AS Name
FROM Persons

Returns this result:

Family Name
Hansen Ola
Svendson Tove
Pettersen Kari


Example: Using a Table Alias

This table (Persons):

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Pettersen Kari Storgt 20 Stavanger

And this SQL:

SELECT LastName, FirstName
FROM Persons AS Employees

Returns this result:

Table Employees:

LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari
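
Aliases only change the names used in the statement and in the result; the stored table is untouched. As a small sketch with Python's sqlite3 module (SQLite assumed; the short table alias `p` is made up), the column aliases can be read back from the cursor's `description`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT);
    INSERT INTO Persons VALUES ('Hansen', 'Ola', 'Timoteivn 10', 'Sandnes');
    INSERT INTO Persons VALUES ('Svendson', 'Tove', 'Borgvn 23', 'Sandnes');
    INSERT INTO Persons VALUES ('Pettersen', 'Kari', 'Storgt 20', 'Stavanger');
""")

# Column aliases: the result columns are named Family and Name.
cursor = conn.execute(
    "SELECT LastName AS Family, FirstName AS Name FROM Persons ORDER BY LastName"
)
headers = [column[0] for column in cursor.description]
alias_rows = cursor.fetchall()

# Table alias: p stands for Persons inside this one statement.
sandnes_rows = conn.execute(
    "SELECT p.LastName FROM Persons AS p WHERE p.City = 'Sandnes' ORDER BY p.LastName"
).fetchall()

print(headers)
print(alias_rows)
print(sandnes_rows)
```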

20. BETWEEN

BETWEEN ... AND

The BETWEEN ... AND operator selects a range of data between two values. These values can be numbers, text, or dates.

SELECT column_name FROM table_name
WHERE column_name
BETWEEN value1 AND value2


Original Table (used in the examples)

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Nordmann Anna Neset 18 Sandnes
Pettersen Kari Storgt 20 Stavanger
Svendson Tove Borgvn 23 Sandnes


Example 1

To display the persons alphabetically between "Hansen" and "Pettersen" (both included), use the following SQL:

SELECT * FROM Persons WHERE LastName
BETWEEN 'Hansen' AND 'Pettersen'

Result:

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Nordmann Anna Neset 18 Sandnes
Pettersen Kari Storgt 20 Stavanger

IMPORTANT! Standard SQL defines BETWEEN ... AND as inclusive: it is the same as column_name >= value1 AND column_name <= value2. Oracle, SQL Server, MySQL, PostgreSQL, and SQLite all follow this rule, so both "Hansen" and "Pettersen" are listed. Surprises usually come from the data rather than the operator: text comparisons depend on the database's collation and case sensitivity, and when a column stores both date and time, an upper bound given as a plain date excludes rows later on that day. If in doubt, check how your database treats the BETWEEN ... AND operator!


Example 2

To display the persons outside the range used in the previous example, use the NOT operator:

SELECT * FROM Persons WHERE LastName
NOT BETWEEN 'Hansen' AND 'Pettersen'

Result:

LastName FirstName Address City
Svendson Tove Borgvn 23 Sandnes
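
One way to check how a particular database treats the endpoints is to run both queries against a small test table. The sketch below assumes Python's built-in sqlite3 module and an in-memory copy of the Persons table; the helper last_names is hypothetical. In SQLite, both endpoints are included.

```python
import sqlite3

# In-memory database standing in for the tutorial's Persons table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        ("Nordmann", "Anna", "Neset 18", "Sandnes"),
        ("Pettersen", "Kari", "Storgt 20", "Stavanger"),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes"),
    ],
)


def last_names(sql):
    """Run a query and return the first column of every row (hypothetical helper)."""
    return [row[0] for row in conn.execute(sql)]


inside = last_names(
    "SELECT LastName FROM Persons "
    "WHERE LastName BETWEEN 'Hansen' AND 'Pettersen' ORDER BY LastName"
)
outside = last_names(
    "SELECT LastName FROM Persons "
    "WHERE LastName NOT BETWEEN 'Hansen' AND 'Pettersen' ORDER BY LastName"
)
print(inside)   # rows inside the range, endpoints included in SQLite
print(outside)  # rows outside the range
```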

19. IN

IN

The IN operator tests whether a column's value matches any value in a list. It is shorthand for several OR conditions on the same column.

SELECT column_name FROM table_name
WHERE column_name IN (value1, value2, ...)


Original Table (used in the examples)

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Nordmann Anna Neset 18 Sandnes
Pettersen Kari Storgt 20 Stavanger
Svendson Tove Borgvn 23 Sandnes


Example 1

To display the persons with LastName equal to "Hansen" or "Pettersen", use the following SQL:

SELECT * FROM Persons
WHERE LastName IN ('Hansen','Pettersen')

Result:

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Pettersen Kari Storgt 20 Stavanger
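
The IN list behaves like a chain of OR comparisons on the same column. A minimal sketch, assuming Python's built-in sqlite3 module and an in-memory copy of the Persons table, runs both forms and compares the results:

```python
import sqlite3

# In-memory database standing in for the tutorial's Persons table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        ("Nordmann", "Anna", "Neset 18", "Sandnes"),
        ("Pettersen", "Kari", "Storgt 20", "Stavanger"),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes"),
    ],
)

# Query written with IN.
in_rows = conn.execute(
    "SELECT LastName, FirstName FROM Persons "
    "WHERE LastName IN ('Hansen', 'Pettersen') ORDER BY LastName"
).fetchall()

# The same condition spelled out with OR.
or_rows = conn.execute(
    "SELECT LastName, FirstName FROM Persons "
    "WHERE LastName = 'Hansen' OR LastName = 'Pettersen' ORDER BY LastName"
).fetchall()

print(in_rows)
print(in_rows == or_rows)
```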

18. AND/OR

AND & OR

AND and OR join two or more conditions in a WHERE clause.

The AND operator displays a row if ALL conditions listed are true. The OR operator displays a row if ANY of the conditions listed are true.


Original Table (used in the examples)

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Svendson Stephen Kaivn 18 Sandnes


Example

Use AND to display each person with the first name equal to "Tove", and the last name equal to "Svendson":

SELECT * FROM Persons
WHERE FirstName='Tove'
AND LastName='Svendson'

Result:

LastName FirstName Address City
Svendson Tove Borgvn 23 Sandnes

Example

Use OR to display each person with the first name equal to "Tove", or the last name equal to "Svendson":

SELECT * FROM Persons
WHERE FirstName='Tove'
OR LastName='Svendson'

Result:

LastName FirstName Address City
Svendson Tove Borgvn 23 Sandnes
Svendson Stephen Kaivn 18 Sandnes

Example

You can also combine AND and OR (use parentheses to form complex expressions):

SELECT * FROM Persons WHERE
(FirstName='Tove' OR FirstName='Stephen')
AND LastName='Svendson'

Result:

LastName FirstName Address City
Svendson Tove Borgvn 23 Sandnes
Svendson Stephen Kaivn 18 Sandnes
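
Parentheses matter because AND binds more tightly than OR. The sketch below, assuming Python's built-in sqlite3 module and an in-memory copy of the Persons table, uses a slightly different condition from the example above (an illustrative choice) so that the two groupings give different results.

```python
import sqlite3

# In-memory database standing in for the tutorial's Persons table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes"),
        ("Svendson", "Stephen", "Kaivn 18", "Sandnes"),
    ],
)

# Without parentheses this is read as:
#   FirstName='Ola' OR (FirstName='Tove' AND LastName='Svendson')
without_parens = [
    row[0]
    for row in conn.execute(
        "SELECT FirstName FROM Persons "
        "WHERE FirstName='Ola' OR FirstName='Tove' AND LastName='Svendson' "
        "ORDER BY FirstName"
    )
]

# With parentheses the OR is evaluated first, then the AND filters on LastName.
with_parens = [
    row[0]
    for row in conn.execute(
        "SELECT FirstName FROM Persons "
        "WHERE (FirstName='Ola' OR FirstName='Tove') AND LastName='Svendson' "
        "ORDER BY FirstName"
    )
]

print(without_parens)
print(with_parens)
```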

17. ORDER BY

The ORDER BY keyword is used to sort the result.

Sort the Rows

The ORDER BY clause is used to sort the rows.

Orders:

Company OrderNumber
Sega 3412
ABC Shop 5678
W3Schools 6798
W3Schools 2312

Example

To display the company names in alphabetical order:

SELECT Company, OrderNumber FROM Orders
ORDER BY Company

Result:

Company OrderNumber
ABC Shop 5678
Sega 3412
W3Schools 6798
W3Schools 2312

Example

To display the company names in alphabetical order AND the OrderNumber in numerical order:

SELECT Company, OrderNumber FROM Orders
ORDER BY Company, OrderNumber

Result:

Company OrderNumber
ABC Shop 5678
Sega 3412
W3Schools 2312
W3Schools 6798

Example

To display the company names in reverse alphabetical order:

SELECT Company, OrderNumber FROM Orders
ORDER BY Company DESC

Result:

Company OrderNumber
W3Schools 6798
W3Schools 2312
Sega 3412
ABC Shop 5678

Example

To display the company names in reverse alphabetical order AND the OrderNumber in numerical order:

SELECT Company, OrderNumber FROM Orders
ORDER BY Company DESC, OrderNumber ASC

Result:

Company OrderNumber
W3Schools 2312
W3Schools 6798
Sega 3412
ABC Shop 5678

Notice that there are two equal company names (W3Schools) in the result above. The second sort column only decides the order of rows that have the same value (or NULL) in the first sort column.
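
The last ORDER BY example can be reproduced as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory copy of the Orders table; it shows that OrderNumber only breaks the tie between the two W3Schools rows.

```python
import sqlite3

# In-memory database standing in for the tutorial's Orders table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Company TEXT, OrderNumber INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [
        ("Sega", 3412),
        ("ABC Shop", 5678),
        ("W3Schools", 6798),
        ("W3Schools", 2312),
    ],
)

# Company sorts in reverse alphabetical order; OrderNumber (ascending)
# is only consulted for rows with the same Company.
ordered = conn.execute(
    "SELECT Company, OrderNumber FROM Orders "
    "ORDER BY Company DESC, OrderNumber ASC"
).fetchall()

for company, order_number in ordered:
    print(company, order_number)
```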

16. DELETE

The DELETE Statement

The DELETE statement is used to delete rows in a table.

Syntax

DELETE FROM table_name
WHERE column_name = some_value


Person:

LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Stien 12 Stavanger


Delete a Row

"Nina Rasmussen" is going to be deleted:

DELETE FROM Person WHERE LastName = 'Rasmussen'

Result

LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger


Delete All Rows

It is possible to delete all rows in a table without deleting the table. This means that the table structure, attributes, and indexes will be intact:

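DELETE FROM table_name

Note: DELETE * FROM table_name is not standard SQL. Microsoft Access accepts it, but most other databases reject it with a syntax error.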
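
Both kinds of delete can be sketched with Python's built-in sqlite3 module, assuming an in-memory copy of the Person table. The final PRAGMA query is SQLite-specific and is used here only to show that the table structure survives an unconditional DELETE.

```python
import sqlite3

# In-memory database standing in for the tutorial's Person table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?, ?)",
    [
        ("Nilsen", "Fred", "Kirkegt 56", "Stavanger"),
        ("Rasmussen", "Nina", "Stien 12", "Stavanger"),
    ],
)

# Delete only the rows that match the WHERE condition.
conn.execute("DELETE FROM Person WHERE LastName = ?", ("Rasmussen",))
after_one_delete = [row[0] for row in conn.execute("SELECT LastName FROM Person")]

# Delete every remaining row; the table itself is kept.
conn.execute("DELETE FROM Person")
remaining = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]

# SQLite-specific: list the column names that still exist.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Person)")]

print(after_one_delete)
print(remaining)
print(columns)
```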

15. UPDATE

The UPDATE Statement

The UPDATE statement is used to modify the data in a table.

Syntax

UPDATE table_name
SET column_name = new_value
WHERE column_name = some_value


Person:

LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Storgt 67


Update one Column in a Row

We want to add a first name to the person with a last name of "Rasmussen":

UPDATE Person SET FirstName = 'Nina'
WHERE LastName = 'Rasmussen'

Result:

LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Storgt 67


Update several Columns in a Row

We want to change the address and add the name of the city:

UPDATE Person
SET Address = 'Stien 12', City = 'Stavanger'
WHERE LastName = 'Rasmussen'

Result:

LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Stien 12 Stavanger
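
The two updates above can be run in sequence like this. It is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory copy of the Person table, with the missing first name and city stored as NULL (None in Python).

```python
import sqlite3

# In-memory database standing in for the tutorial's Person table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?, ?)",
    [
        ("Nilsen", "Fred", "Kirkegt 56", "Stavanger"),
        ("Rasmussen", None, "Storgt 67", None),
    ],
)

# Update one column in the matching row.
conn.execute(
    "UPDATE Person SET FirstName = ? WHERE LastName = ?", ("Nina", "Rasmussen")
)

# Update several columns in the matching row.
conn.execute(
    "UPDATE Person SET Address = ?, City = ? WHERE LastName = ?",
    ("Stien 12", "Stavanger", "Rasmussen"),
)

updated = conn.execute(
    "SELECT * FROM Person WHERE LastName = ?", ("Rasmussen",)
).fetchone()
untouched = conn.execute(
    "SELECT * FROM Person WHERE LastName = ?", ("Nilsen",)
).fetchone()
print(updated)
print(untouched)
```

Without the WHERE clause, an UPDATE changes every row in the table.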

14. INSERT INTO

The INSERT INTO Statement

The INSERT INTO statement is used to insert new rows into a table.

Syntax

INSERT INTO table_name
VALUES (value1, value2, ...)

You can also specify the columns for which you want to insert data:

INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...)


Insert a New Row

This "Persons" table:

LastName FirstName Address City
Pettersen Kari Storgt 20 Stavanger

And this SQL statement:

INSERT INTO Persons
VALUES ('Hetland', 'Camilla', 'Hagabakka 24', 'Sandnes')

Will give this result:

LastName FirstName Address City
Pettersen Kari Storgt 20 Stavanger
Hetland Camilla Hagabakka 24 Sandnes


Insert Data in Specified Columns

This "Persons" table:

LastName FirstName Address City
Pettersen Kari Storgt 20 Stavanger
Hetland Camilla Hagabakka 24 Sandnes

And this SQL statement:

INSERT INTO Persons (LastName, Address)
VALUES ('Rasmussen', 'Storgt 67')

Will give this result:

LastName FirstName Address City
Pettersen Kari Storgt 20 Stavanger
Hetland Camilla Hagabakka 24 Sandnes
Rasmussen Storgt 67
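
Both forms of INSERT INTO can be tried with the sketch below, which assumes Python's built-in sqlite3 module and an in-memory copy of the Persons table. Columns that are left out of the column list receive NULL (None in Python), since this table defines no default values.

```python
import sqlite3

# In-memory database standing in for the tutorial's Persons table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.execute(
    "INSERT INTO Persons VALUES ('Pettersen', 'Kari', 'Storgt 20', 'Stavanger')"
)

# Insert a full row: one value for every column, in table order.
conn.execute(
    "INSERT INTO Persons VALUES ('Hetland', 'Camilla', 'Hagabakka 24', 'Sandnes')"
)

# Insert into selected columns only; FirstName and City are left as NULL.
conn.execute(
    "INSERT INTO Persons (LastName, Address) VALUES ('Rasmussen', 'Storgt 67')"
)

partial_row = conn.execute(
    "SELECT * FROM Persons WHERE LastName = 'Rasmussen'"
).fetchone()
row_count = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]
print(partial_row)
print(row_count)
```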

13. WHERE CLAUSE

The WHERE Clause

To conditionally select data from a table, a WHERE clause can be added to the SELECT statement.

Syntax

SELECT column FROM table
WHERE column operator value

With the WHERE clause, the following operators can be used:

Operator Description
= Equal
<> Not equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
BETWEEN Between an inclusive range
LIKE Search for a pattern
IN Match any value in a list of values

Note: In some versions of SQL the <> operator may be written as !=


Using the WHERE Clause

To select only the persons living in the city "Sandnes", we add a WHERE clause to the SELECT statement:

SELECT * FROM Persons
WHERE City='Sandnes'

"Persons" table

LastName FirstName Address City Year
Hansen Ola Timoteivn 10 Sandnes 1951
Svendson Tove Borgvn 23 Sandnes 1978
Svendson Stale Kaivn 18 Sandnes 1980
Pettersen Kari Storgt 20 Stavanger 1960

Result

LastName FirstName Address City Year
Hansen Ola Timoteivn 10 Sandnes 1951
Svendson Tove Borgvn 23 Sandnes 1978
Svendson Stale Kaivn 18 Sandnes 1980


Using Quotes

Note that we have used single quotes around the conditional values in the examples.

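SQL uses single quotes around text values. Some database systems also accept double quotes, but in standard SQL double quotes delimit identifiers such as column names, so single quotes are the safe choice. Numeric values should not be enclosed in quotes.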

For text values:

This is correct:
SELECT * FROM Persons WHERE FirstName='Tove'
This is wrong:
SELECT * FROM Persons WHERE FirstName=Tove

For numeric values:

This is correct:
SELECT * FROM Persons WHERE Year>1965
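This is wrong (the quoted value is text, so the comparison relies on implicit type conversion, which differs between databases):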
SELECT * FROM Persons WHERE Year>'1965'
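
The WHERE examples above can be run as shown below. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory copy of the Persons table. The second query passes the value through a ? placeholder, which lets the driver handle quoting for both text and numeric values.

```python
import sqlite3

# In-memory database standing in for the tutorial's Persons table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons "
    "(LastName TEXT, FirstName TEXT, Address TEXT, City TEXT, Year INTEGER)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes", 1951),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes", 1978),
        ("Svendson", "Stale", "Kaivn 18", "Sandnes", 1980),
        ("Pettersen", "Kari", "Storgt 20", "Stavanger", 1960),
    ],
)

# Text value written as a single-quoted literal.
in_sandnes = [
    row[0]
    for row in conn.execute(
        "SELECT FirstName FROM Persons WHERE City = 'Sandnes' ORDER BY Year"
    )
]

# Numeric value passed as a parameter instead of being typed into the SQL.
born_after_1965 = [
    row[0]
    for row in conn.execute(
        "SELECT FirstName FROM Persons WHERE Year > ? ORDER BY Year", (1965,)
    )
]

print(in_sandnes)
print(born_after_1965)
```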


The LIKE Condition

The LIKE condition is used to specify a search for a pattern in a column.

Syntax

SELECT column FROM table
WHERE column LIKE pattern

The "%" sign is a wildcard that matches any sequence of zero or more characters. It can be used before the pattern, after it, or both. The "_" sign matches exactly one character.


Using LIKE

The following SQL statement will return persons with first names that start with an 'O':

SELECT * FROM Persons
WHERE FirstName LIKE 'O%'

The following SQL statement will return persons with first names that end with an 'a':

SELECT * FROM Persons
WHERE FirstName LIKE '%a'

The following SQL statement will return persons with first names that contain the pattern 'la':

SELECT * FROM Persons
WHERE FirstName LIKE '%la%'
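
The three LIKE patterns above can be checked against sample data. The sketch below assumes Python's built-in sqlite3 module and an in-memory copy of the Persons table; the helper first_names_like is hypothetical. Whether LIKE distinguishes upper and lower case varies between databases (SQLite ignores case for ASCII letters by default).

```python
import sqlite3

# In-memory database standing in for the tutorial's Persons table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes"),
        ("Svendson", "Stale", "Kaivn 18", "Sandnes"),
        ("Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)


def first_names_like(pattern):
    """Return the first names matching a LIKE pattern (hypothetical helper)."""
    cur = conn.execute(
        "SELECT FirstName FROM Persons WHERE FirstName LIKE ? ORDER BY FirstName",
        (pattern,),
    )
    return [row[0] for row in cur]


starts_with_o = first_names_like("O%")   # names starting with 'O'
ends_with_a = first_names_like("%a")     # names ending with 'a'
contains_la = first_names_like("%la%")   # names containing 'la'
ends_with_e = first_names_like("%e")     # extra pattern, not in the text above

print(starts_with_o, ends_with_a, contains_la, ends_with_e)
```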