컬렉션에 추가 한 다음 정렬하거나 정렬 된 컬렉션에 추가하는 것이 더 빠릅니까?

code

컬렉션에 추가 한 다음 정렬하거나 정렬 된 컬렉션에 추가하는 것이 더 빠릅니까?

codestyles 2020. 10. 27. 08:15

컬렉션에 추가 한 다음 정렬하거나 정렬 된 컬렉션에 추가하는 것이 더 빠릅니까?

다음 Map과 같은 경우 :

HashMap<Integer, ComparableObject> map;

자연 순서를 사용하여 정렬 된 값 모음을 얻고 싶습니다. 어떤 방법이 가장 빠릅니까?

(ㅏ)

와 같은 정렬 가능한 컬렉션의 인스턴스를 ArrayList만들고 값을 추가 한 다음 정렬합니다.

List<ComparableObject> sortedCollection = new ArrayList<ComparableObject>(map.values());
Collections.sort(sortedCollection);

(비)

와 같이 정렬 된 컬렉션의 인스턴스를 TreeSet만든 다음 값을 추가합니다.

Set<ComparableObject> sortedCollection = new TreeSet<ComparableObject>(map.values());

결과 컬렉션은 수정되지 않으므로 정렬은 한 번만 수행하면됩니다.

TreeSet은 메서드에 log(n)대한 시간 복잡도를 보장 add()/remove()/contains()합니다. 정렬 ArrayList은 n*log(n)작업을 수행하지만 add()/get()작업 만 1수행합니다.

따라서 주로 검색하고 자주 정렬하지 않는 ArrayList경우 더 나은 선택입니다. 자주 정렬하지만 그렇게 많이 검색하지 않으면 TreeSet더 나은 선택이 될 것입니다.

이론적으로는 마지막 정렬이 더 빨라야합니다. 프로세스를 통해 정렬 된 상태를 유지하려면 CPU 시간이 추가로 필요할 수 있습니다.

CS 관점에서 두 작업 모두 NlogN이지만 1 개의 정렬은 더 낮은 상수를 가져야합니다.

두 세계의 장점을 모두 사용하지 않으시겠습니까? 다시 사용하지 않는 경우 TreeSet을 사용하여 정렬하고 내용으로 ArrayList를 초기화하십시오.

List<ComparableObject> sortedCollection = 
    new ArrayList<ComparableObject>( 
          new TreeSet<ComparableObject>(map.values()));

편집하다:

나는 세 가지 접근 방식 (ArrayList + Collections.sort, TreeSet 및 두 세계의 최선의 접근 방식)을 테스트하기 위해 벤치 마크를 만들었으며 ( pastebin.com/5pyPMJav 에서 액세스 할 수 있습니다 ) 내 것이 항상 승리합니다. 테스트 파일은 값이 의도적으로 끔찍한 비교기를 갖는 10000 개의 요소로 맵을 생성 한 다음 세 가지 전략 각각이 a) 데이터를 정렬하고 b) 반복 할 기회를 얻습니다. 다음은 샘플 출력입니다 (직접 테스트 할 수 있음).

편집 : Thingy.compareTo (Thingy)에 대한 호출을 기록하는 측면을 추가했으며 이전 솔루션 (적어도 정렬)보다 훨씬 빠른 PriorityQueues를 기반으로하는 새로운 전략도 추가했습니다.

compareTo() calls:123490
Transformer ArrayListTransformer
    Creation: 255885873 ns (0.255885873 seconds) 
    Iteration: 2582591 ns (0.002582591 seconds) 
    Item count: 10000

compareTo() calls:121665
Transformer TreeSetTransformer
    Creation: 199893004 ns (0.199893004 seconds) 
    Iteration: 4848242 ns (0.004848242 seconds) 
    Item count: 10000

compareTo() calls:121665
Transformer BestOfBothWorldsTransformer
    Creation: 216952504 ns (0.216952504 seconds) 
    Iteration: 1604604 ns (0.001604604 seconds) 
    Item count: 10000

compareTo() calls:18819
Transformer PriorityQueueTransformer
    Creation: 35119198 ns (0.035119198 seconds) 
    Iteration: 2803639 ns (0.002803639 seconds) 
    Item count: 10000

이상하게도 내 접근 방식은 반복에서 가장 잘 수행됩니다 (반복에서 ArrayList 접근 방식에 차이가 없다고 생각했을 것입니다. 벤치 마크에 버그가 있습니까?)

면책 조항 : 이것이 아마도 끔찍한 벤치 마크라는 것을 알고 있지만, 당신에게 요점을 전달하는 데 도움이되며, 제 접근 방식이이기도록 조작하지 않았습니다.

(코드는 equals / hashcode / compareTo 빌더에 대한 아파치 커먼즈 / lang에 의존하지만 리팩토링하기 쉽습니다)

B)를 구현하기로 선택한 경우 하단의 TreeSet에 대한 내 의견을 읽으십시오.

앱이 가끔 정렬을 수행하지만이를 많이 반복하는 경우 간단한 정렬되지 않은 목록을 사용하는 것이 가장 좋습니다. 한 번만 정렬하면 더 빠른 반복의 이점을 얻을 수 있습니다. 반복은 배열 목록에서 특히 빠릅니다.

However if you want sort order to be guaranteed all of the time or you are possibly adding / removing elements frequently then use a sorted collection and take the hit on iteration.

So in your case I would say A) is the better option. The list is sorted once, doesn't change and therefore benefits from being an array. Iteration should be very fast, especially if you know its an ArrayList and can directly use the ArrayList.get() instead of an Iterator.

I'd also add that TreeSet by definition is a Set which means objects are unique. A TreeSet determines equality by using compareTo on your Comparator / Comparable. You could easily find yourself missing data if you try to add two objects whose compareTo returns a value of 0. e.g. adding "C", "A", "B", "A" to a TreeSet will return "A", "B", "C"

Collections.sort uses mergeSort which has O(nlog n).

TreeSet has Red-Black tree underlying, basic operations has O(logn). Hence n elements has also O(nlog n).

So both are same big O algorithm.

Inserting in a SortedSet is O(log(n)) (BUT! the current n and not the final n). Inserting in a List is 1.

Sorting in a SortedSet is already included in inserting, so it is 0. Sorting in a List is O(n*log(n)).

So SortedSet total complexity is O(n * k), k < log(n) for all cases but the last. Instead, List total complexity is O(n * log(n) + n), so O(n * log(n)).

So, SortedSet mathematically has the best performance. But in the end, you have a Set instead of a List (because SortedList doesn't exist) and Set provides you fewer features than List. So in my opinion, the best solution for available features and performance is the one proposed by Sean Patrick Floyd:

use a SortedSet for inserting,
put the SortedSet as a parameter for creating a List to return.

참고URL : https://stackoverflow.com/questions/3607593/is-it-faster-to-add-to-a-collection-then-sort-it-or-add-to-a-sorted-collection

'code' 카테고리의 다른 글

SQL Server (localdb) \ v11.0 설명 (0)	2020.10.27
SQL Server 프로파일 러와 동등한 PostgreSQL이 있습니까? (0)	2020.10.27
C #에서 명령 줄 인수 이스케이프 (0)	2020.10.27
Clojure 1.4에서 require 내에서 refer의 사용은 무엇입니까? (0)	2020.10.27
IntelliJ IDEA에서 '기본 gradle 래퍼'의 버전을 변경하는 방법은 무엇입니까? (0)	2020.10.27

현재글컬렉션에 추가 한 다음 정렬하거나 정렬 된 컬렉션에 추가하는 것이 더 빠릅니까?

codestyle

컬렉션에 추가 한 다음 정렬하거나 정렬 된 컬렉션에 추가하는 것이 더 빠릅니까?

컬렉션에 추가 한 다음 정렬하거나 정렬 된 컬렉션에 추가하는 것이 더 빠릅니까?

(ㅏ)

(비)

'code' 카테고리의 다른 글

'code'의 다른글

티스토리툴바

컬렉션에 추가 한 다음 정렬하거나 정렬 된 컬렉션에 추가하는 것이 더 빠릅니까?

컬렉션에 추가 한 다음 정렬하거나 정렬 된 컬렉션에 추가하는 것이 더 빠릅니까?

(ㅏ)

(비)

'code' 카테고리의 다른 글

'code'의 다른글

관련글

티스토리툴바