python - Count unique elements row wise in an ndarray

Question

asked Oct 24, 2021 in Technique by 深蓝 (71.8m points)

An extension to this question. In addition to finding the unique elements row-wise, I want an array of the same shape that holds the count of each unique value at the position of its first occurrence in the row. For example, if the initial array looks like this:

a = np.array([[1,  2, 2, 3,  4, 5],
              [1,  2, 3, 3,  4, 5],
              [1,  2, 3, 4,  4, 5],
              [1,  2, 3, 4,  5, 5],
              [1,  2, 3, 4,  5, 6]])

I would like to get this as the output from the function:

np.array([[1,  2, 0, 1,  1, 1],
          [1,  1, 2, 0,  1, 1],
          [1,  1, 1, 2,  0, 1],
          [1,  1, 1, 1,  2, 0],
          [1,  1, 1, 1,  1, 1]])

NumPy 1.9 added a return_counts argument to np.unique, but it returns the counts for the flattened array. Is there some way to reconstruct these counts in the original array's dimensions, with zeros where values were duplicated?
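For reference, here is a minimal sketch of what return_counts gives on the example array, assuming NumPy 1.9 or later. The counts are taken over the whole flattened array, so the per-row information is lost:

```python
import numpy as np

a = np.array([[1, 2, 2, 3, 4, 5],
              [1, 2, 3, 3, 4, 5],
              [1, 2, 3, 4, 4, 5],
              [1, 2, 3, 4, 5, 5],
              [1, 2, 3, 4, 5, 6]])

# Without an axis argument, np.unique flattens the input first,
# so each count covers all 30 elements rather than a single row.
values, counts = np.unique(a, return_counts=True)
print(values)  # [1 2 3 4 5 6]
print(counts)  # [5 6 6 6 6 1]
```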

1 Answer

深蓝 · Answer 1 · 2021-10-23T20:04:53+0000

The idea behind this answer is very similar to the one used here. I'm adding a unique imaginary number to each row. Therefore, no two numbers from different rows can be equal. Thus, you can find all the unique values in a 2D array per row with just one call to np.unique.
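As a small illustration of the offset trick, here is a sketch on a two-row toy array (the array is made up for this example). The offsets use the same linspace formula as the function in this answer:

```python
import numpy as np

# Toy input, chosen only for illustration: both rows contain 1 and 2.
a = np.array([[1, 2, 2],
              [1, 2, 3]])

# One distinct imaginary offset per row: here 0j and 1.5j.
weight = 1j * np.linspace(0, a.shape[1], a.shape[0], endpoint=False)
b = a + weight[:, np.newaxis]

# Row 0 keeps imaginary part 0, row 1 gets imaginary part 1.5, so the
# 1 and 2 of row 0 no longer compare equal to the 1 and 2 of row 1.
u = np.unique(b)
print(np.unique(a).size)  # 3 distinct values when rows are mixed together
print(u.size)             # 5 distinct (row, value) pairs
```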

The index, ind, returned when return_index=True gives you the location of the first occurrence of each unique value.

The count, cnt, returned when return_counts=True gives you the count.

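np.put(b, ind, cnt) places each count at the location of the first occurrence of the corresponding unique value. Here b is a new array of zeros, and ind holds indices into the flattened array, which is what np.put expects.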
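To make these three steps concrete, here is a minimal sketch on a single row (the first row of the example array). The name out is only for illustration:

```python
import numpy as np

row = np.array([1, 2, 2, 3, 4, 5])

# ind: index of the first occurrence of each unique value
# cnt: how many times each unique value appears
values, ind, cnt = np.unique(row, return_index=True, return_counts=True)
print(ind)  # [0 1 3 4 5]
print(cnt)  # [1 2 1 1 1]

# Start from zeros and write each count at its first-occurrence index;
# positions of repeated values are never written and stay 0.
out = np.zeros_like(row)
np.put(out, ind, cnt)
print(out)  # [1 2 0 1 1 1]
```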

One obvious limitation of the trick used here is that the original array must have an int or float dtype. It cannot have a complex dtype to start with, since adding a unique imaginary number to each row could then make values from different rows equal.
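If the input may be complex, one possible workaround is to drop the offset trick and call np.unique once per row. This is a different technique from the one used in this answer, sketched here under the assumption that the input is 2-D. The name count_unique_by_row_loop is made up for this example, and the loop will likely be slower for arrays with many rows:

```python
import numpy as np

def count_unique_by_row_loop(a):
    """Per-row unique counts placed at first occurrences (assumes 2-D input)."""
    a = np.asarray(a)
    out = np.zeros(a.shape, dtype=np.intp)
    for i, row in enumerate(a):
        # ind indexes into this row only, so no cross-row collisions can occur.
        _, ind, cnt = np.unique(row, return_index=True, return_counts=True)
        out[i, ind] = cnt
    return out

# Complex input, which the imaginary-offset trick cannot handle safely.
c = np.array([[1 + 1j, 1 + 1j, 2],
              [1, 1 + 1j, 1]])
print(count_unique_by_row_loop(c))
# [[2 0 1]
#  [2 1 0]]
```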

import numpy as np

a = np.array([[1,  2, 2, 3,  4, 5],
              [1,  2, 3, 3,  4, 5],
              [1,  2, 3, 4,  4, 5],
              [1,  2, 3, 4,  5, 5],
              [1,  2, 3, 4,  5, 6]])

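def count_unique_by_row(a):
    # Give each row its own imaginary offset so equal values in different rows stay distinct.
    weight = 1j*np.linspace(0, a.shape[1], a.shape[0], endpoint=False)
    b = a + weight[:, np.newaxis]
    # ind: flat index of each unique value's first occurrence; cnt: its count.
    u, ind, cnt = np.unique(b, return_index=True, return_counts=True)
    # Start from zeros so positions of repeated values stay 0, then write the counts.
    b = np.zeros_like(a)
    np.put(b, ind, cnt)
    return b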

yields

In [79]: count_unique_by_row(a)
Out[79]: 
array([[1, 2, 0, 1, 1, 1],
       [1, 1, 2, 0, 1, 1],
       [1, 1, 1, 2, 0, 1],
       [1, 1, 1, 1, 2, 0],
       [1, 1, 1, 1, 1, 1]])
