数据结构week10


Week10排序(下)

10.1 快速排序

算法概述

算法概述

void Quicksort(ElementType A[], int N) {
    if (N < 2) return;
    pivot = 从A[] 中选一个主元;
    将S = {A[] \ pivot} 分成2个独立子集: 
    A1 = {a属于S | a ≤ pivot} 和 A2 = {a属于S | a ≥ pivot};
    A[] = Quicksort(A1, N1){ pivot }Quicksort(A2, N2);
}

什么是快速排序算法的最好情况?

每次正好中分$\rightarrow T(N) = O( NlogN ) $

选主元

令 $pivot = A[0]$?

1 2 3 4 5 6 $\cdots$ N-1 N
2 3 4 5 6 $\cdots$ N-1 N
3 4 5 6 $\cdots$ N-1 N

$$
\begin{aligned}
T ( N ) &= O( N ) + T ( N – 1 )\newline
&= O( N ) + O ( N – 1 ) + T( N – 2 )\newline
&= O( N ) + O ( N – 1 ) + … + O( 1 )\newline
&= O( N^2 )
\end{aligned}
$$

随机取 pivot? rand()函数不便宜啊!

取头、中、尾的中位数

ElementType Median3(ElementType A[], int Left, int Right) {
    int Center = (Left + Right) / 2;
    if (A[Left] > A[Center]) Swap(&A[Left], &A[Center]);
    if (A[Left] > A[Right]) Swap(&A[Left], &A[Right]);
    if (A[Center] > A[Right]) Swap(&A[Center], &A[Right]);
    /* A[ Left ] <= A[ Center ] <= A[ Right ] */
    Swap(&A[Center], &A[Right - 1]); /* 将pivot藏到右边 */
    /* 只需要考虑 A[ Left+1 ] … A[ Right–2 ] */
    return A[Right - 1]; /* 返回 pivot */
}

子集划分

如果有元素正好等于pivot怎么办?

停下来交换!

小规模数据的处理

快速排序的问题

  • 用递归……

  • 对小规模的数据(例如N不到100)可能还不如插入排序快

  • 解决方案

  • 当递归的数据规模充分小,则停止递归,直接调用简单排序(例如插入排序)

  • 在程序中定义一个Cutoff的阈值

算法实现

void Quicksort(ElementType A[], int Left, int Right) {
    if (Cutoff <= Right - Left) {
        Pivot = Median3(A, Left, Right);
        i = Left;
        j = Right – 1;
        for (;;) {
            while (A[++i] < Pivot) {}
            while (A[ ––j] > Pivot) {}
            if (i < j)
                Swap(&A[i], &A[j]);
            else
                break;
        }
        Swap(&A[i], &A[Right - 1]);
        Quicksort(A, Left, i - 1);
        Quicksort(A, i + 1, Right);
    } else
        Insertion_Sort(A + Left, Right - Left + 1);
}
void Quick_Sort(ElementType A[], int N) { Quicksort(A, 0, N - 1); }

10.2 表排序

算法概述

间接排序

定义一个指针数组作为“表”(table)

A [0] [1] [2] [3] [4] [5] [6] [7]
key f d c a g b h e
table 3 5 2 1 7 0 4 6

物理排序

N个数字的排列由若干个独立的环组成

如何判断一个环的结束?

if ( table[i] == i )

复杂度分析

最好情况:初始即有序

最坏情况:

  • 有 $\lfloor N / 2\rfloor$ 个环,每个环包含2个元素
  • 需要$\lfloor 3N / 2\rfloor$次元素移动

$T = O( m N ) , m $​是每个A元素的复制时间。

10.3 基数排序

桶排序

假设我们有 N 个学生,他们的成绩是0到100之间的整数( 于是有$ M = 101$ 个不同的成绩值) 。 如何在线性时间内将学生按成绩排序?

void Bucket_Sort(ElementType A[], int N) {
    count[] 初始化;
    while (读入1个学生成绩grade)
		将该生插入count[grade]链表;
    for (i = 0; i < M; i++) {
        if (count[i]) 输出整个count[i] 链表;
    }
}

$T(N, M) = O( M+N ) $

如果 M >> N该怎么办?

基数排序

假设我们有 N = 10 个整数,每个整数的值在0到999之间( 于是有 M = 1000 个不同的值)。 还有
可能在线性时间内排序吗?

输 序列 入序列: 64, 8, 216, 512, 27, 729, 0, 1, 343, 125

用“次位优先”(Least Significant Digit)

$T=O(P(N+B))$

Bucket 0 1 2 3 4 5 6 7 8 9
Pass1 0 1 512 343 64 125 216 27 8 729
0 512 125 343 64
Pass2 1 216 27
8 729
0 125 216 343 512 729
1
Pass3 8
27
64

多关键字的排序

一副扑克牌是按2种关键字排序的:花色和面值

用“主位优先” ( Most Significant Digit )排序: 为花色建4个桶

在每个桶内分别排序, 最后合并结果。

用“次位优先” (Least Significant Digit )排序: 为面值建13个桶

将结果合并 然后再为花色建4个桶

10.4 排序算法的比较

排序方法 平均时间复杂度 最坏情况下时间复杂度 额外空间复杂度 稳定性
简单选择排序 $O(N^2)$ $O(N^2)$ $O(1)$ 不稳定
冒泡排序 $O(N^2)$ $O(N^2)$ $O(1)$ 稳定
直接插入排序 $O(N^2) $ $O(N^2)$ $O(1)$ 稳定
希尔排序 $O(N^d)$ $O(N^2)$ $O(1)$ 不稳定
堆排序 $O(NlogN)$ $O(NlogN)$ $O(1)$ 不稳定
快速排序 $O(NlogN)$ $O(N^2)$ $O(logN)$ 不稳定
归并排序 $O(NlogN)$ $O(NlogN)$ $O(N) $ 稳定
基数排序 $O(P(N+B))$ $O(P(N+B))$ $O(N+B)$ 稳定

代码模板

/* 快速排序 - 直接调用库函数 */

#include <stdlib.h>

/*---------------简单整数排序--------------------*/
int compare(const void *a, const void *b) { /* 比较两整数。非降序排列 */
    return (*(int *)a - *(int *)b);
}
/* 调用接口 */
qsort(A, N, sizeof(int), compare);
/*---------------简单整数排序--------------------*/

/*--------------- 一般情况下,对结构体Node中的某键值key排序 ---------------*/
struct Node {
    int key1, key2;
} A[MAXN];

int compare2keys(const void *a, const void *b) {
    /* 比较两种键值:按key1非升序排列;如果key1相等,则按key2非降序排列*/
    int k;
    if (((const struct Node *)a)->key1 < ((const struct Node *)b)->key1)
        k = 1;
    else if (((const struct Node *)a)->key1 > ((const struct Node *)b)->key1)
        k = -1;
    else { /* 如果key1相等 */
        if (((const struct Node *)a)->key2 < ((const struct Node *)b)->key2)
            k = -1;
        else
            k = 1;
    }
    return k;
}
/* 调用接口 */
qsort(A, N, sizeof(struct Node), compare2keys);
/*--------------- 一般情况下,对结构体Node中的某键值key排序 ---------------*/

/* 快速排序 */

ElementType Median3(ElementType A[], int Left, int Right) {
    int Center = (Left + Right) / 2;
    if (A[Left] > A[Center])
        Swap(&A[Left], &A[Center]);
    if (A[Left] > A[Right])
        Swap(&A[Left], &A[Right]);
    if (A[Center] > A[Right])
        Swap(&A[Center], &A[Right]);
    /* 此时A[Left] <= A[Center] <= A[Right] */
    Swap(&A[Center], &A[Right - 1]); /* 将基准Pivot藏到右边*/
    /* 只需要考虑A[Left+1] … A[Right-2] */
    return A[Right - 1]; /* 返回基准Pivot */
}

void Qsort(ElementType A[], int Left, int Right) { /* 核心递归函数 */
    int Pivot, Cutoff, Low, High;

    if (Cutoff <= Right - Left) { /* 如果序列元素充分多,进入快排 */
        Pivot = Median3(A, Left, Right); /* 选基准 */
        Low = Left;
        High = Right - 1;
        while (1) { /*将序列中比基准小的移到基准左边,大的移到右边*/
            while (A[++Low] < Pivot)
                ;
            while (A[--High] > Pivot)
                ;
            if (Low < High)
                Swap(&A[Low], &A[High]);
            else
                break;
        }
        Swap(&A[Low], &A[Right - 1]); /* 将基准换到正确的位置 */
        Qsort(A, Left, Low - 1);      /* 递归解决左边 */
        Qsort(A, Low + 1, Right);     /* 递归解决右边 */
    } else
        InsertionSort(A + Left, Right - Left + 1); /* 元素太少,用简单排序 */
}

void QuickSort(ElementType A[], int N) { /* 统一接口 */
    Qsort(A, 0, N - 1);
}

/* 基数排序 - 次位优先 */

/* 假设元素最多有MaxDigit个关键字,基数全是同样的Radix */
#define MaxDigit 4
#define Radix 10

/* 桶元素结点 */
typedef struct Node *PtrToNode;
struct Node {
    int key;
    PtrToNode next;
};

/* 桶头结点 */
struct HeadNode {
    PtrToNode head, tail;
};
typedef struct HeadNode Bucket[Radix];

int GetDigit(int X, int D) { /* 默认次位D=1, 主位D<=MaxDigit */
    int d, i;

    for (i = 1; i <= D; i++) {
        d = X % Radix;
        X /= Radix;
    }
    return d;
}

void LSDRadixSort(ElementType A[], int N) { /* 基数排序 - 次位优先 */
    int D, Di, i;
    Bucket B;
    PtrToNode tmp, p, List = NULL;

    for (i = 0; i < Radix; i++) /* 初始化每个桶为空链表 */
        B[i].head = B[i].tail = NULL;
    for (i = 0; i < N; i++) { /* 将原始序列逆序存入初始链表List */
        tmp = (PtrToNode)malloc(sizeof(struct Node));
        tmp->key = A[i];
        tmp->next = List;
        List = tmp;
    }
    /* 下面开始排序 */
    for (D = 1; D <= MaxDigit; D++) { /* 对数据的每一位循环处理 */
        /* 下面是分配的过程 */
        p = List;
        while (p) {
            Di = GetDigit(p->key, D); /* 获得当前元素的当前位数字 */
            /* 从List中摘除 */
            tmp = p;
            p = p->next;
            /* 插入B[Di]号桶尾 */
            tmp->next = NULL;
            if (B[Di].head == NULL)
                B[Di].head = B[Di].tail = tmp;
            else {
                B[Di].tail->next = tmp;
                B[Di].tail = tmp;
            }
        }
        /* 下面是收集的过程 */
        List = NULL;
        for (Di = Radix - 1; Di >= 0; Di--) { /* 将每个桶的元素顺序收集入List */
            if (B[Di].head) {                 /* 如果桶不为空 */
                /* 整桶插入List表头 */
                B[Di].tail->next = List;
                List = B[Di].head;
                B[Di].head = B[Di].tail = NULL; /* 清空桶 */
            }
        }
    }
    /* 将List倒入A[]并释放空间 */
    for (i = 0; i < N; i++) {
        tmp = List;
        List = List->next;
        A[i] = tmp->key;
        free(tmp);
    }
}

/* 基数排序 - 主位优先 */

/* 假设元素最多有MaxDigit个关键字,基数全是同样的Radix */

#define MaxDigit 4
#define Radix 10

/* 桶元素结点 */
typedef struct Node *PtrToNode;
struct Node {
    int key;
    PtrToNode next;
};

/* 桶头结点 */
struct HeadNode {
    PtrToNode head, tail;
};
typedef struct HeadNode Bucket[Radix];

int GetDigit(int X, int D) { /* 默认次位D=1, 主位D<=MaxDigit */
    int d, i;

    for (i = 1; i <= D; i++) {
        d = X % Radix;
        X /= Radix;
    }
    return d;
}

void MSD(ElementType A[],
         int L,
         int R,
         int D) { /* 核心递归函数: 对A[L]...A[R]的第D位数进行排序 */
    int Di, i, j;
    Bucket B;
    PtrToNode tmp, p, List = NULL;
    if (D == 0)
        return; /* 递归终止条件 */

    for (i = 0; i < Radix; i++) /* 初始化每个桶为空链表 */
        B[i].head = B[i].tail = NULL;
    for (i = L; i <= R; i++) { /* 将原始序列逆序存入初始链表List */
        tmp = (PtrToNode)malloc(sizeof(struct Node));
        tmp->key = A[i];
        tmp->next = List;
        List = tmp;
    }
    /* 下面是分配的过程 */
    p = List;
    while (p) {
        Di = GetDigit(p->key, D); /* 获得当前元素的当前位数字 */
        /* 从List中摘除 */
        tmp = p;
        p = p->next;
        /* 插入B[Di]号桶 */
        if (B[Di].head == NULL)
            B[Di].tail = tmp;
        tmp->next = B[Di].head;
        B[Di].head = tmp;
    }
    /* 下面是收集的过程 */
    i = j = L; /* i, j记录当前要处理的A[]的左右端下标 */
    for (Di = 0; Di < Radix; Di++) { /* 对于每个桶 */
        if (B[Di].head) { /* 将非空的桶整桶倒入A[], 递归排序 */
            p = B[Di].head;
            while (p) {
                tmp = p;
                p = p->next;
                A[j++] = tmp->key;
                free(tmp);
            }
            /* 递归对该桶数据排序, 位数减1 */
            MSD(A, i, j - 1, D - 1);
            i = j; /* 为下一个桶对应的A[]左端 */
        }
    }
}

void MSDRadixSort(ElementType A[], int N) { /* 统一接口 */
    MSD(A, 0, N - 1, MaxDigit);
}

编程习题

1.统计工龄

给定公司N名员工的工龄,要求按工龄增序输出每个工龄段有多少员工。

输入格式:

输入首先给出正整数N(≤105),即员工总人数;随后给出N个整数,即每个员工的工龄,范围在[0, 50]。

输出格式:

按工龄的递增顺序输出每个工龄的员工个数,格式为:“工龄:人数”。每项占一行。如果人数为0则不输出该项。

输入样例:

8
10 2 0 5 7 2 5 2

输出样例:

0:1
2:3
5:2
7:1
10:1

思路

简单的桶排序,无须赘述。

AC code

#include <stdio.h>

int read() {
    int num = 0, symbol = 1;
    char ch = getchar();
    while (ch < '0' || ch > '9') {
        if (ch == '-') symbol = -1;
        ch = getchar();
    }
    while (ch >= '0' && ch <= '9')
        num = (num << 3) + (num << 1) + (ch - '0'), ch = getchar();
    return num * symbol;
}

int main() {
    int N, num, age[51] = {0};
    N = read();
    while (N--) {
        num = read();
        ++age[num];
    }
    for (int i = 0; i < 51; i++)
        if (age[i]) printf("%d:%d\n", i, age[i]);
    return 0;
}
测试点 提示 结果 分数 耗时 内存
0 sample 有人数为0的段 答案正确 12 3 ms 196 KB
1 最大N,工龄全0 答案正确 4 5 ms 196 KB
2 最大N,按递减顺序给出 答案正确 4 5 ms 196 KB

2.PAT Judge

2014年PAT春季考试真题

The ranklist of PAT is generated from the status list, which shows the scores of the submissions. This time you are supposed to generate the ranklist for PAT.

Input Specification:

Each input file contains one test case. For each case, the first line contains 3 positive integers, $N (≤10^4)$, the total number of users, $K (≤5)$, the total number of problems, and $M (≤10^5)$, the total number of submissions. It is then assumed that the user id’s are 5-digit numbers from 00001 to N, and the problem id’s are from 1 to K. The next line contains K positive integers p[i] (i=1, ..., *K*), where p[i]` corresponds to the full mark of the i-th problem. Then M lines follow, each gives the information of a submission in the following format:

user_id problem_id partial_score_obtained

where partial_score_obtained is either −1 if the submission cannot even pass the compiler, or is an integer in the range [0, p[problem_id]]. All the numbers in a line are separated by a space.

Output Specification:

For each test case, you are supposed to output the ranklist in the following format:

rank user_id total_score s[1] ... s[K]

where rank is calculated according to the total_score, and all the users with the same total_score obtain the same rank; and s[i] is the partial score obtained for the i-th problem. If a user has never submitted a solution for a problem, then “-“ must be printed at the corresponding position. If a user has submitted several solutions to solve one problem, then the highest score will be counted.

The ranklist must be printed in non-decreasing order of the ranks. For those who have the same rank, users must be sorted in nonincreasing order according to the number of perfectly solved problems. And if there is still a tie, then they must be printed in increasing order of their id’s. For those who has never submitted any solution that can pass the compiler, or has never submitted any solution, they must NOT be shown on the ranklist. It is guaranteed that at least one user can be shown on the ranklist.

Sample Input:

7 4 20
20 25 25 30
00002 2 12
00007 4 17
00005 1 19
00007 2 25
00005 1 20
00002 2 2
00005 1 15
00001 1 18
00004 3 25
00002 2 25
00005 3 22
00006 4 -1
00001 2 18
00002 1 20
00004 1 15
00002 4 18
00001 3 4
00001 4 2
00005 2 -1
00004 2 0

Sample Output:

1 00002 63 20 25 - 18
2 00005 42 20 0 22 -
2 00007 42 - 25 - 17
2 00001 42 18 18 4 2
5 00004 40 15 0 25 -

思路

此题关键在于注意细节

  • 按照总分排序高低排序。
  • 总分相同,那么按照答对题目个数(完全正确,拿满分)进行排序。
  • 然后再相同就需要按照学号进行排序。
  • 一道题都没有提交或者提交但没有一道题通过编译的同学不输出。(提交得零分也要输出)
  • 提交没有通过编译的题目输出0,没有提交的题目输出“-”。

AC code

#include <stdio.h>
#include <stdlib.h>

#define MAXP 6
#define MAXU 10001
int N, K, M;
int Score[MAXP];

typedef struct Node {
    int id;
    int flag; //是否输出
    int total_score;
    int score[MAXP];
    int total_number;
} Student;
Student student[MAXU];

int read() {
    int num = 0, symbol = 1;
    char ch = getchar();
    while (ch < '0' || ch > '9') {
        if (ch == '-') symbol = -1;
        ch = getchar();
    }
    while (ch >= '0' && ch <= '9')
        num = (num << 3) + (num << 1) + (ch - '0'), ch = getchar();
    return num * symbol;
}

void Init(Student student[]) {
    int id, pro, score;
    for (int i = 1; i <= K; i++) Score[i] = read();
    for (int i = 1; i <= N; i++)
        for (int j = 1; j < 6; j++) student[i].score[j] = -1;
    for (int i = 0; i < M; i++) {
        id = read(), pro = read(), score = read();
        student[id].id = id;
        if (student[id].score[pro] == -1) student[id].score[pro] = 0;
        if (student[id].score[pro] <= score) {
            student[id].flag = 1;
            student[id].total_score += (score - student[id].score[pro]);
            student[id].score[pro] = score;
        }
    }
}

void Count_Number(Student studentList[]) {
    for (int i = 1; i <= N; i++) {
        if (studentList[i].total_score != 0) {
            for (int j = 1; j <= K; j++) {
                if (studentList[i].score[j] == Score[j]) {
                    studentList[i].total_number++;
                }
            }
        }
    }
}

int cmp(const void *a, const void *b) {
    if (((Student *)a)->total_score > ((Student *)b)->total_score || //分数高低
        (((Student *)a)->total_score == ((Student *)b)->total_score &&
         ((Student *)a)->total_number >
             ((Student *)b)->total_number) || //分数相同按完全答对题目个数
        (((Student *)a)->total_score == ((Student *)b)->total_score &&
         ((Student *)a)->total_number == ((Student *)b)->total_number &&
         ((Student *)a)->id < ((Student *)b)->id)) { //前两者都相同,按id
        return 0;
    }
    return 1;
}

//输出函数
void print(Student student[]) {
    int number = 1;
    for (int i = 1; i <= N; i++) {
        if (student[i].flag > 0) {
            printf("%d ", number);
            if (student[i + 1].total_score != student[i].total_score)
                number = i + 1;
            printf("%05d %d", student[i].id, student[i].total_score);
            for (int j = 1; j <= K; j++)
                if (student[i].score[j] == -1)
                    printf(" -");
                else
                    printf(" %d", student[i].score[j]);
            printf("\n");
        }
    }
}
int main() {
    scanf("%d %d %d", &N, &K, &M);
    Init(student);
    Count_Number(student);
    qsort(student + 1, N, sizeof(Student), cmp);
    print(student);
    return 0;
}
测试点 提示 结果 分数 耗时 内存
0 sample 部分分;后面提交没有前面高;并列按id排;没提交;编译不过;有人有题目没提交过;提交0分、提交-1 答案正确 13 3 ms 312 KB
1 最小N 答案正确 3 3 ms 304 KB
2 K最小,有0分 答案正确 3 3 ms 192 KB
3 M最小 答案正确 3 4 ms 708 KB
4 最大随机测试,有满分的题目重复提交 答案正确 3 18 ms 928 KB

3.Sort with Swap$(0, i)$

Given any permutation of the numbers {0, 1, 2,…, N−1}, it is easy to sort them in increasing order. But what if Swap(0, *) is the ONLY operation that is allowed to use? For example, to sort {4, 0, 2, 1, 3} we may apply the swap operations in the following way:

Swap(0, 1) => {4, 1, 2, 0, 3}
Swap(0, 3) => {4, 1, 2, 3, 0}
Swap(0, 4) => {0, 1, 2, 3, 4}

Now you are asked to find the minimum number of swaps need to sort the given permutation of the first N nonnegative integers.

Input Specification:

Each input file contains one test case, which gives a positive $N (≤10^5)$ followed by a permutation sequence of {0, 1, …, N−1}. All the numbers in a line are separated by a space.

Output Specification:

For each case, simply print in a line the minimum number of swaps need to sort the given permutation.

Sample Input:

10
3 5 7 2 6 4 9 0 8 1

Sample Output:

9

思路

题意是给定N个数字的排列,仅利用与0交换排序。

事实上,注意到 N个数字的排列由若干个独立的环组成 。

环分3种

  1. 只有1个元素:不需要交换

  2. 环里$n_0$个元素,包括0:需要$n_0–1$次交换

  3. 第$i$个环里有$n_i$个元素,不包括0:先把0换到环里,再进行$(n_i+1)–1$次交换 —— 一共是$n_i+1$次交换

若$N$个元素的序列中包含$S$个单元环、 $K$个多元环,则交换次数为:

$n_0-1+\sum^{K-1}{i=1}(n_i+1)=\sum^{K-1}{i=0}n_i\ +K-2=N-S+K-2$​

注意若$a[0]=0$,则不存在第二种情况,公式变为$N-S+K$

AC code

#include <stdio.h>

#define MAXSIZE 100000
int N, S = 0, K = 0, pos, num[MAXSIZE], flag[MAXSIZE] = {0};

int read() {
    int num = 0, symbol = 1;
    char ch = getchar();
    while (ch < '0' || ch > '9') {
        if (ch == '-') symbol = -1;
        ch = getchar();
    }
    while (ch >= '0' && ch <= '9')
        num = (num << 3) + (num << 1) + (ch - '0'), ch = getchar();
    return num * symbol;
}

int main() {
    N = read();
    for (int i = 0; i < N; i++) num[i] = read();
    for (int i = 0; i < N; i++) {
        if (num[i] == i)
            ++S, flag[i] = 1;
        else if (!flag[i]) {
            ++K;
            pos = num[i];
            flag[i] = 1;
            while (pos != i) flag[pos] = 1, pos = num[pos];
        }
    }
    if (num[0])
        printf("%d", N - S + K - 2);
    else
        printf("%d", N - S + K);
    return 0;
}
测试点 提示 结果 分数 耗时 内存
0 sample (2大环+1单元环,重复3遍) 答案正确 15 3 ms 320 KB
1 最大N,两两倒序 答案正确 3 6 ms 1084 KB
2 全倒序 答案正确 3 4 ms 572 KB
3 初始值为0 答案正确 2 4 ms 176 KB
4 最小N 答案正确 1 3 ms 192 KB
5 N=2 答案正确 1 3 ms 320 KB

文章作者: Phoenix
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 Phoenix !
  目录