How to find distinct indices of elements of one unsorted array in another one?

Last updated: 2023-10-16

There are two char arrays. Both have the same size and are permutations of each other.

For example:

char a[] = {'a', 'b', 'a', 'b', 'c', 'a', 'b', 'a', 'b', 'c' };
char b[] = {'a', 'a', 'b', 'b', 'c', 'c', 'b', 'b', 'a', 'a' };

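I want to find distinct positions (1-based indices) in array a for the elements of array b. In this example: 1, 3, 2, 4, 5, 10, 7, 9, 6, 8.

I implemented the following brute-force approach, which is O(n²):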

for (int i = 0; i < n; i++)
{
    for (int j = 0; j < n; j++)
    {
        if (b[i] == a[j])
        {
            cout << j + 1 << " ";
            a[j] = '\0'; // mark this position as used so it is not matched again
            break;
        }
    }
}

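Is there any way in C++ to reduce this time complexity to O(n log n) or even lower?

It can be reduced to O(n) time, at the cost of O(n) memory.

You should create a two-dimensional array. Its first dimension covers every possible character (256 = 2^8 indices in total, since sizeof(char) = 1 byte), and its second dimension covers the n elements of array a.

So, if you have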

char a[n] = ...;
char b[n] = ...;

you should allocate

int c[256][n]; // O(n) memory
int s[256]; // O(1) memory
int e[256]; // O(1) memory

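and fill them with zeros. You can use e[i] as a counter of the characters whose code is i. In c[i][0], c[i][1], ... you can store the actual positions in array a of the character which has code i.

The first step is to iterate over array a and, each time:

write the position of the character: c[a[i]][m] = i, where m = e[a[i]]
increment e[a[i]]

You can use array s to store how many positions have already been printed (s[j] is the number of printed characters with code j). The second step is to iterate over b and, each time:

output c[b[i]][m], where m = s[b[i]]
increment s[b[i]]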

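Each step takes O(n) time, so the total time complexity is O(n). It is important to note that this bound is not the kind you get from a randomized approach (such as a hash table, where you have to take the collision probability into account). The complexity is the same in the worst case.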
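The two passes described above can be sketched roughly as follows. This is only a minimal sketch, not the answerer's exact code: the function name match_positions is hypothetical, the fixed int c[256][n] table is swapped for one std::vector per character code (so the counter e[i] becomes the vector's size), and positions are stored 1-based to match the output in the question. It assumes both arrays hold the same characters in some order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (name and signature are illustrative only).
// For each element of b, returns a distinct 1-based position of that
// character in a. Assumes a and b contain the same multiset of characters.
std::vector<int> match_positions(const std::vector<char>& a,
                                 const std::vector<char>& b) {
    assert(a.size() == b.size());

    // c[code] lists the positions of that character in a, in increasing order.
    // c[code].size() plays the role of the counter e[code] from the answer.
    std::vector<std::vector<int>> c(256);
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Go through unsigned char so the bucket index is never negative.
        c[static_cast<unsigned char>(a[i])].push_back(static_cast<int>(i) + 1);
    }

    // s[code] counts how many positions of that character were already used.
    std::vector<std::size_t> s(256, 0);
    std::vector<int> result;
    result.reserve(b.size());
    for (char ch : b) {
        const unsigned char code = static_cast<unsigned char>(ch);
        assert(s[code] < c[code].size());  // b must not need more copies than a has
        result.push_back(c[code][s[code]]);
        ++s[code];
    }
    return result;
}
```

Calling match_positions on the arrays from the question yields 1, 3, 2, 4, 5, 10, 7, 9, 6, 8. Using per-character vectors keeps the total storage at n positions plus 256 small headers, instead of a full 256 × n table.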

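You can simply insert the indices into a multimap keyed by the element value, then iterate over the other array, find the first index you need, and erase it from the map. This should be O(n log n):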

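std::multimap<char, int> charmap;
for (unsigned i = 0; i < sizeof a; i++) {
    charmap.emplace(a[i], i);
}

for (char c : b) {
    // lower_bound gives the earliest-inserted entry for c;
    // find is allowed to return any entry with that key
    auto it = charmap.lower_bound(c);
    std::cout << it->second + 1 << " ";
    charmap.erase(it);
}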

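You can have a hash table with a list in each bucket. The list will contain the indices of b[] at which the character occurs.

Take each element of b[], say b[i], and find in the hash table the list whose key equals b[i] (in your case it will be "a", "b" or "c"). If there is no list, create one containing the value i+1. If the list is found, append i+1 to the end of it.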

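In your example, after the insertion is complete:

"a" will be the list 1, 2, 9, 10
"b" will be the list 3, 4, 7, 8
"c" will be the list 5, 6

When processing an entry, use the first element of its list and remove that element from the list. In the example above:

After processing the first entry "a", you output 1 and remove 1 from the list at key "a". Now

"a" will be the list 2, 9, 10
"b" will be the list 3, 4, 7, 8
"c" will be the list 5, 6

After processing the second entry "b", you output 3 and remove 3 from the list at key "b". Now

"a" will be the list 2, 9, 10
"b" will be the list 4, 7, 8
"c" will be the list 5, 6

This would be O(N)? (assuming you have a doubly linked list)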
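A rough sketch of the hash-table-of-lists idea follows, assuming std::unordered_map for the hash table and std::list for the per-character lists. The function name positions_via_hash_lists and its parameter names are hypothetical. The walkthrough above indexes b and consumes entries of a, so that corresponds to passing b as source and a as queries; swapping the two arguments gives the output asked for in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>
#include <vector>

// Hypothetical helper (name and signature are illustrative only).
// Indexes `source` by character, then hands out one stored 1-based position
// per element of `queries`. Assumes both hold the same multiset of characters.
std::vector<int> positions_via_hash_lists(const std::vector<char>& source,
                                          const std::vector<char>& queries) {
    assert(source.size() == queries.size());

    // One list of 1-based positions per distinct character.
    std::unordered_map<char, std::list<int>> buckets;
    for (std::size_t i = 0; i < source.size(); ++i) {
        // operator[] creates an empty list the first time a character is seen.
        buckets[source[i]].push_back(static_cast<int>(i) + 1);
    }

    std::vector<int> result;
    result.reserve(queries.size());
    for (char ch : queries) {
        std::list<int>& positions = buckets.at(ch);  // throws if ch was never indexed
        assert(!positions.empty());
        result.push_back(positions.front());  // use the first stored position
        positions.pop_front();                // and remove it so it is used once
    }
    return result;
}
```

With source = b and queries = a from the question, the first two results are 1 and 3, as in the walkthrough. Both push_back and pop_front on std::list are constant time, so the running time is O(n) on average, subject to the usual hash-table caveats.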

Edit: I prefer @user2079303's answer over mine. Same concept, easier to implement.

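A concrete implementation of a reverse lookup. You will see that the first two loops are overhead. The unordered_map is basically the "hash table" suggested in the other answer.

auto it is an iterator whose first is the index and whose second is the value (in our case a queue). Confusingly, since this is a reverse lookup, the "index" is also what you would think of as the value (and vice versa: each "value" in the queue is an index into a).

I used a queue because you only want to use each index once (so pop is equivalent to setting the value to '\0' in the brute-force version).

That said, I think the best way to understand it is to step through it in a debugger.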

#include <unordered_map>
#include <queue>
#include <memory>
#include <iostream>
using namespace std;
int main()
{
    char a[] = { 'a', 'b', 'a', 'b', 'c', 'a', 'b', 'a', 'b', 'c' };
    char b[] = { 'a', 'a', 'b', 'b', 'c', 'c', 'b', 'b', 'a', 'a' };
    unordered_map<char, shared_ptr<queue<int>>> reverseIndex;
    //create a queue for each unique char
    for (int i = 0; i < 10; ++i)
    {
        auto it = reverseIndex.find(a[i]);
        if (it == reverseIndex.end())
        {
            reverseIndex.emplace(a[i], make_shared<queue<int>>());
        }
    }
    //put the indexes as *values* into our unordered_map queues
    for (int i = 0; i < 10; ++i)
    {
        auto it = reverseIndex.find(a[i]);
        it->second->push(i);
    }
    //perform the actual work
    for (int i = 0; i < 10; ++i)
    {
        auto it = reverseIndex.find(b[i]);
        cout << it->second->front() + 1 << "\n";
        //you'll need to push the value back onto the queue for multiple uses of reverseIndex:  it->second->push(it->second->front());
        it->second->pop();
    }
    return 0;
}