2. Shell & Command Line#
1. ls - List Directory Contents#
Basic Usage:#
ls # list files in current directory
ls /path/to/dir # list files in specific directory
All Important Flags:#
ls -a # list ALL files including hidden ✅ (exam answer)
ls -l # long format (permissions, size, date, owner)
ls -h # human-readable sizes (KB, MB, GB) - NOT hidden files ❌
ls -la # long format + hidden files combined
ls -lh # long format + human-readable sizes
ls -lt # sort by modification time
ls -lr # reverse order
ls -lS # sort by file size
ls -R # recursive (list subdirectories too)
Hidden Files:#
- Start with . (dot) - Examples: .gitignore, .env, .bashrc, .ssh/
- ls without -a → hidden files NOT shown
- ls -a → ALL files shown including hidden ✅
- ❌ ls -h → human-readable sizes (NOT hidden files)
- ❌ show -a → not a valid Unix command
- ❌ dir /ah → Windows command only
ls -la
# Output:
# drwxr-xr-x 5 user group 4096 Dec 14 16:45 .
# drwxr-xr-x 20 user group 4096 Dec 13 10:22 ..
# -rw-r--r-- 1 user group 220 Dec 14 09:00 .bashrc ← hidden
# -rw-r--r-- 1 user group 156 Dec 14 09:00 .gitignore ← hidden
# drwxr-xr-x 2 user group 4096 Dec 14 16:45 data/
# -rw-r--r-- 1 user group 2048 Dec 14 16:45 script.py
#
# Columns: permissions | links | owner | group | size | date | name
2. find - Search for Files by NAME#
# Find files by name
find . -name "*.py" # all Python files
find . -name "data.csv" # specific file
find /home -name "*.log" # in specific directory
find . -name "Pine Ridge" # ❌ finds FILE named "Pine Ridge"
# NOT content inside files
# Find by type
find . -type f -name "*.csv" # files only
find . -type d -name "data" # directories only
# Find by size
find . -size +10M # files larger than 10MB
find . -size -1k # files smaller than 1KB
# Find and execute command
find . -name "*.log" -exec cat {} \; # cat all log files
find . -name "*.pyc" -exec rm {} \; # delete all .pyc files
# Find modified recently
find . -mtime -7 # modified in last 7 days
find . -mtime +30 # modified more than 30 days ago
find vs grep:#
| Command | Searches | Example |
|---|---|---|
| find | File NAMES | find . -name "*.log" |
| grep | File CONTENTS | grep "error" logfile.txt |
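The distinction can be checked directly. A minimal sketch, using a hypothetical scratch directory and made-up file names:

```shell
# Demo: find matches file NAMES, grep matches file CONTENTS
tmpdir=$(mktemp -d)
echo "error: disk full" > "$tmpdir/app.log"
echo "all good" > "$tmpdir/notes.txt"

# find returns the file whose NAME matches *.log
found=$(find "$tmpdir" -name "*.log")

# grep -l returns the file whose CONTENT contains "error"
matched=$(grep -l "error" "$tmpdir"/*)

echo "find  -> $found"
echo "grep  -> $matched"
rm -r "$tmpdir"
```

Here both happen to point at app.log, but for different reasons: find never reads the file, grep never looks at the name.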
3. cat - Display File Contents#
cat file.txt # display entire file
cat file1.txt file2.txt # display multiple files
cat -n file.txt # display with line numbers
cat > newfile.txt # create new file (type content, Ctrl+D to end)
cat >> file.txt # append to file
- ❌ Does NOT search content (use grep)
- ❌ NOT suitable for very large files (use less or head)
4. grep - Search Text INSIDE Files#
Basic Usage:#
grep "search_term" file.txt # search in file ✅
grep "Pine Ridge" evacuation.txt # find all lines with "Pine Ridge"
grep "error" app.log # find error lines in log
Important Flags:#
grep -i "term" file.txt # case-insensitive search
grep -n "term" file.txt # show line numbers
grep -v "term" file.txt # invert - lines NOT containing term
grep -c "term" file.txt # count matching lines
grep -l "term" *.txt # list files containing term
grep -r "term" ./dir/ # recursive search in directory
grep -w "term" file.txt # whole word match only
grep -A 2 "term" file.txt # show 2 lines AFTER match
grep -B 2 "term" file.txt # show 2 lines BEFORE match
grep -E "pattern" file.txt # extended regex ✅ (exam relevant)
grep with Regex:#
grep -E "[0-9]+" file.txt # lines with numbers
grep -E "^ERROR" file.txt # lines starting with ERROR
grep -E "\.py$" file.txt # lines ending with .py
grep -E "(GET|POST)" access.log # lines with GET or POST
grep -E "1[0-5][0-9]" file.txt # match 100-159 (e.g. a numeric range)
5. awk - Process Text by Fields#
Basic Usage:#
awk '{print $1}' file.txt # print first field
awk '{print $3, $7}' file.txt # print 3rd and 7th fields ✅
awk -F',' '{print $2}' file.csv # use comma as delimiter
awk '{print NR, $0}' file.txt # print with line numbers
Common Patterns:#
# Print specific fields from log
awk '{print $1, $4, $7}' access.log
# Filter and print
awk '$9 == 200 {print $0}' access.log # lines where 9th field = 200
# Calculate sum
awk '{sum += $5} END {print sum}' file.txt
# Print lines matching condition
awk '$7 ~ /\/checkout/' access.log # URL contains /checkout/
# Count occurrences
awk '{count[$7]++} END {for (url in count) print count[url], url}' log
Special Variables:#
| Variable | Meaning |
|---|---|
| $0 | Entire line |
| $1, $2… | Field 1, 2… |
| NR | Current line number |
| NF | Number of fields |
| FS | Field separator |
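NR and NF are easy to confuse; a quick sketch with made-up input shows both at once:

```shell
# Three lines with 3, 2, and 1 fields - print NR (line number) and NF (field count)
out=$(printf 'a b c\nd e\nf\n' | awk '{print NR, NF}')
echo "$out"
# 1 3
# 2 2
# 3 1
```

NR grows with every input line; NF is recomputed per line from the current field separator.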
6. sort - Sort Output#
sort file.txt # alphabetical sort
sort -n file.txt # numeric sort
sort -r file.txt # reverse order
sort -nr file.txt # numeric reverse ✅ (exam relevant - sort -nr)
sort -k2 file.txt # sort by 2nd field
sort -k2 -n file.txt # sort by 2nd field numerically
sort -u file.txt # sort and remove duplicates
sort -t',' -k2 file.csv # sort CSV by 2nd column
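The classic gotcha is that plain sort is lexicographic, so "10" sorts before "9". A quick check with sample numbers:

```shell
# Lexicographic vs numeric: which value comes first?
lex=$(printf '9\n10\n2\n' | sort | head -1)     # compares strings char by char
num=$(printf '9\n10\n2\n' | sort -n | head -1)  # compares numeric values
echo "lexicographic first: $lex, numeric first: $num"
# lexicographic first: 10, numeric first: 2
```

This is why counting pipelines end in sort -nr, not sort -r: the counts from uniq -c are numbers.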
7. uniq - Count/Remove Duplicates#
uniq file.txt # remove consecutive duplicates
uniq -c file.txt # count occurrences ✅ (exam relevant)
uniq -d file.txt # show only duplicates
uniq -u file.txt # show only unique lines
# IMPORTANT: uniq only works on SORTED input
# Always sort before uniq:
sort file.txt | uniq -c
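The "sorted input" rule is worth seeing fail. With made-up input where a duplicate is not adjacent:

```shell
# uniq only collapses ADJACENT duplicates - unsorted input splits the count
unsorted=$(printf 'b\na\nb\n' | uniq -c | wc -l | tr -d ' ')        # "b" counted twice
sorted=$(printf 'b\na\nb\n' | sort | uniq -c | wc -l | tr -d ' ')   # duplicates now adjacent
echo "groups without sort: $unsorted, with sort: $sorted"
# groups without sort: 3, with sort: 2
```

Without sort you get three groups (b, a, b); with sort the two b lines sit next to each other and collapse into one counted group.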
8. head and tail - View File Portions#
head file.txt # first 10 lines (default)
head -n 20 file.txt # first 20 lines
head -1000 file.txt # first 1000 lines
tail file.txt # last 10 lines
tail -n 20 file.txt # last 20 lines
tail -f file.txt # follow file in real-time (for logs) ✅
9. wc - Word/Line/Character Count#
wc file.txt # lines, words, characters
wc -l file.txt # count lines only ✅
wc -w file.txt # count words only
wc -c file.txt # count bytes
wc -m file.txt # count characters
# Count lines matching pattern
grep "ERROR" app.log | wc -l
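One subtlety worth knowing: wc -l counts newline characters, not "lines" in the intuitive sense, so a file missing its trailing newline undercounts by one. A quick sketch:

```shell
# wc -l counts newlines - the last line is missed if it has no trailing \n
with_nl=$(printf 'a\nb\n' | wc -l | tr -d ' ')
without_nl=$(printf 'a\nb' | wc -l | tr -d ' ')
echo "$with_nl vs $without_nl"
# 2 vs 1
```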
10. Shell Pipeline - Combining Commands#
The | (pipe) operator:#
- Takes output of left command as input to right command
- Can chain multiple commands together
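A minimal pipeline, using made-up input, to show the mechanics before the full exam example:

```shell
# stdout of printf becomes stdin of grep; -c counts matching lines
count=$(printf 'apple\nbanana\napple\n' | grep -c "apple")
echo "$count"
# 2
```

Every stage in a longer chain works the same way: each command reads the previous command's output as its input.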
Complete Log Analysis Pipeline:#
# ✅ Exam-relevant pipeline for large log file analysis:
grep -E "(0[89]|1[0-8]):[0-5][0-9]" traffic.log \
| awk '{print $3, $7}' \
| sort \
| uniq -c \
| sort -nr
# Breakdown:
# grep -E "..." → filter lines matching time pattern
# awk '{...}' → extract fields 3 and 7
# sort → sort alphabetically (needed before uniq)
# uniq -c → count unique occurrences
# sort -nr → sort by count, highest first
Other Useful Pipelines:#
# Count HTTP status codes in log
awk '{print $9}' access.log | sort | uniq -c | sort -nr
# Find most common IP addresses
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -10
# Find all POST requests
grep "POST" access.log | awk '{print $7}' | sort | uniq -c | sort -nr
# Count errors by hour
grep "ERROR" app.log | awk '{print substr($2,1,2)}' | sort | uniq -c
# Search specific pattern and count
grep -c "404" access.log
# Extract unique URLs
awk '{print $7}' access.log | sort -u
11. dir /ah - Windows Command (NOT Unix)#
Windows: dir /ah ← lists hidden files in Windows
Unix: ls -a ← lists hidden files in Unix/Linux/Mac
❌ dir /ah does NOT work on Unix/Linux/Mac
❌ show -a does not exist in any standard shell
12. Other Essential Shell Commands#
Directory Navigation:#
pwd # print working directory
cd /path/to/dir # change directory
cd .. # go up one level
cd ~ # go to home directory
cd - # go to previous directory
mkdir new_folder # create directory
mkdir -p a/b/c # create nested directories
rmdir empty_folder # remove empty directory
rm -rf folder/ # remove directory and contents (careful!)
File Operations:#
cp source.txt dest.txt # copy file
cp -r source/ dest/ # copy directory recursively
mv old.txt new.txt # move/rename file
rm file.txt # remove file
rm -f file.txt # force remove (no confirmation)
touch newfile.txt # create empty file or update timestamp
Viewing & Searching:#
less file.txt # view file page by page (better than cat for large files)
more file.txt # older pager, similar to less (forward-only on some systems)
nano file.txt # simple text editor
vi file.txt # powerful text editor
# Search in less:
# /search_term → forward search
# ?search_term → backward search
# n → next match
# q → quit
Process Management:#
ps aux # list all running processes
kill -9 PID # force kill process
top # real-time process monitor
htop # improved process monitor
File Permissions:#
chmod 755 script.py # set permissions (rwxr-xr-x)
chmod +x script.py # make executable
chmod 644 file.txt # rw-r--r--
chown user:group file.txt # change owner
# Permission notation:
# r=4, w=2, x=1
# 755 → rwxr-xr-x (owner: all, group+others: read+execute)
# 644 → rw-r--r-- (owner: read+write, others: read only)
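The octal arithmetic can be verified against ls -l. A sketch using a throwaway temp file (on SELinux/ACL systems the string may carry a trailing `.` or `+`, so the check matches the prefix):

```shell
# 6 = 4(r)+2(w), 4 = 4(r): chmod 644 should yield rw-r--r--
tmpfile=$(mktemp)
chmod 644 "$tmpfile"
perms=$(ls -l "$tmpfile" | awk '{print $1}')
echo "$perms"   # -rw-r--r-- (possibly with a trailing . or +)
rm "$tmpfile"
```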
Network & Downloads:#
curl https://api.example.com/data # make HTTP request
curl -O https://example.com/file.zip # download file
wget https://example.com/file.zip # download file
curl -H "Authorization: Bearer TOKEN" url # with header
curl -X POST -d '{"key":"value"}' url # POST request
Environment Variables:#
export API_KEY="your_key" # set environment variable
echo $API_KEY # print variable
env # list all environment variables
unset API_KEY # remove variable
printenv API_KEY # print specific variable
13. Shell Scripting Basics#
Shebang and Basic Script:#
#!/bin/bash
# This is a comment
# Variables
NAME="John"
echo "Hello, $NAME"
# User input
read -p "Enter name: " name
echo "Hello, $name"
# Conditionals
if [ -f "file.txt" ]; then
    echo "File exists"
elif [ -d "folder" ]; then
    echo "Folder exists"
else
    echo "Neither exists"
fi
# Loops
for i in {1..5}; do
    echo "Number: $i"
done
for file in *.csv; do
    echo "Processing: $file"
done
# While loop
while IFS= read -r line; do
    echo "$line"
done < file.txt
ETL Script with Idempotency:#
#!/bin/bash
# Idempotent ETL script
TEMP_FILE="/tmp/output_temp.csv"
FINAL_FILE="/output/final.csv"
LOG_FILE="/logs/etl.log"
echo "$(date): Starting ETL" >> "$LOG_FILE"
# Write to temp file first
python process_data.py > "$TEMP_FILE"
# Check if processing succeeded
if [ $? -eq 0 ]; then
    # Atomic move to final destination (atomic only within the same filesystem)
    mv "$TEMP_FILE" "$FINAL_FILE" # ✅ atomic operation
    echo "$(date): ETL completed" >> "$LOG_FILE"
else
    echo "$(date): ETL failed" >> "$LOG_FILE"
    rm -f "$TEMP_FILE"
    exit 1
fi
14. grep for Log Analysis - Exam Scenarios#
Scenario 1: Search content inside a file#
# ✅ Correct: grep searches CONTENT
grep "Pine Ridge" evacuation-logs.txt
# ❌ Wrong: find searches FILE NAMES
find . -name "Pine Ridge"
# ❌ Wrong: ls lists directory contents
ls "Pine Ridge"
# ❌ Wrong: cat shows everything, no search
cat evacuation-logs.txt
Scenario 2: Search across multiple files#
grep -r "ERROR" ./logs/ # recursive search
grep "404" *.log # all log files
grep -l "critical" ./logs/* # list files containing "critical"
Scenario 3: Filter an access log#
# POST requests to /checkout/
grep "POST /checkout" access.log
# Requests between 12:00-15:59
grep -E "1[2-5]:[0-5][0-9]" access.log
# Combine with awk for field extraction
grep "POST" access.log | awk '{print $1, $4, $7, $9}'
Shell Commands - Quick Reference Card#
Navigation & Files:
pwd → current directory
ls -a → list all files (including hidden) ✅
ls -la → long format + hidden
cd path → change directory
find → search FILE NAMES
Viewing Content:
cat → display file (small files)
less → view file page by page (large files)
head -n N → first N lines
tail -n N → last N lines
tail -f → follow file in real-time
Searching:
grep "term" file → search CONTENT ✅
grep -i → case-insensitive
grep -E "pattern" → extended regex
grep -r "term" dir → recursive search
Processing:
awk '{print $N}' → extract field N
sort → sort output
sort -nr → numeric reverse sort
uniq -c → count unique ✅
wc -l → count lines
Pipeline:
cmd1 | cmd2 → pipe output to next command
grep | awk | sort | uniq -c | sort -nr ✅
Log Analysis Pattern:
grep -E "pattern" log | awk '{print $field}' | sort | uniq -c | sort -nr