What are keypoints in image processing?
When using OpenCV, for example, algorithms such as SIFT or SURF are often used to detect keypoints. My question is: what actually are these keypoints?
I understand that they are some kind of "points of interest" in an image. I also know that they are scale invariant, and I know that they are circular.
I also found out that they have an orientation, but I couldn't understand what it actually is. Is it an angle, but between the radius and something? Could you give some explanation? I think I need something simpler first, and after that it will be easier to understand the papers.
Those are very good questions. Let's tackle each point one at a time.
My question is: what actually are these keypoints?
Keypoints are the same thing as interest points. They are spatial locations, or points in the image, that define what is interesting or what stands out in the image. Interest point detection is actually a subset of blob detection, which aims to find interesting regions or spatial areas in an image. The reason why keypoints are special is that no matter how the image changes... whether the image rotates, shrinks/expands, is translated (these would all be affine transformations...) or is subject to distortion (i.e. a projective transformation or homography), you should be able to find the same keypoints in this modified image when comparing it with the original image. Here's an example from a post I wrote a while ago:
Source: 'module' object has no attribute 'drawMatches' opencv python
The image on the right is a rotated version of the image on the left. I've also only displayed the top 10 matches between the two images. Looking at the top 10 matches, these are the points we would probably want to focus on so that we could remember what the image was about: the face of the cameraman, as well as the camera, the tripod, and some of the interesting textures on the buildings in the background. You can see that these same points were found between the two images and that they were successfully matched.
Therefore, what you should take away from this is that these are points in the image that are interesting and that they should be found no matter how the image is distorted.
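To make this concrete, here is a minimal OpenCV (Python) sketch in the spirit of the cameraman figure above. It assumes opencv-python 4.4 or newer (so that cv2.SIFT_create() is available) and uses 'cameraman.png' as a placeholder filename: it detects keypoints in an image and in a rotated copy of it, then draws the 10 best matches.

```python
import cv2

# Load a grayscale image and make a rotated copy of it.
# 'cameraman.png' is just a placeholder filename for this sketch.
img1 = cv2.imread('cameraman.png', cv2.IMREAD_GRAYSCALE)
h, w = img1.shape
M = cv2.getRotationMatrix2D((w / 2, h / 2), 45, 1.0)   # rotate by 45 degrees
img2 = cv2.warpAffine(img1, M, (w, h))

# Detect keypoints and compute descriptors in both images with SIFT.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force match the descriptors and keep only the 10 best matches.
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)[:10]

# Draw the matches side by side, as in the figure above.
out = cv2.drawMatches(img1, kp1, img2, kp2, matches, None)
cv2.imwrite('matches.png', out)
```

The same physical points should come out on top even though the second image is rotated, which is exactly the invariance described above.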
I understand that they are some kind of "points of interest" of an image. I also know that they are scale invariant and I know they are circular.
You are correct. Scale invariant means that no matter how you scale the image, you should still be able to find those points.
Now we are going to venture into the descriptor part. What makes keypoints different between frameworks is the way you describe these keypoints. These are what are known as descriptors. Each keypoint that you detect has an associated descriptor that accompanies it. Some frameworks only do a keypoint detection, while other frameworks are simply a description framework and they don't detect the points. There are also some that do both - they detect and describe the keypoints. SIFT and SURF are examples of frameworks that both detect and describe the keypoints.
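As a rough illustration of this detect/describe split in OpenCV's Python API (again assuming opencv-python 4.4+ and the placeholder filename 'cameraman.png'): FAST is detection-only, while SIFT can detect, describe, or do both in a single call.

```python
import cv2

img = cv2.imread('cameraman.png', cv2.IMREAD_GRAYSCALE)  # placeholder filename

# FAST is a detector only: it finds keypoints but has no descriptor of its own.
fast = cv2.FastFeatureDetector_create()
kp_fast = fast.detect(img, None)
print(len(kp_fast), "FAST keypoints (no descriptors)")

# SIFT both detects and describes. You can run the two stages separately...
sift = cv2.SIFT_create()
kp = sift.detect(img, None)        # detection only
kp, des = sift.compute(img, kp)    # describe the detected points

# ...or do both in one call.
kp, des = sift.detectAndCompute(img, None)
print(len(kp), "SIFT keypoints, descriptor shape", des.shape)  # (N, 128)
```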
Descriptors are primarily concerned with both the scale and the orientation of the keypoint. We've nailed down the concept of the keypoints themselves, but we need the descriptor part if our purpose is to match keypoints between different images. Now, what you mean by "circular"... that correlates with the scale that the point was detected at. Take for example this image that is taken from the VLFeat Toolbox tutorial:
You see that any points that are yellow are interest points, but some of these points have a different circle radius. These deal with scale. How interest points work in a general sense is that we decompose the image into multiple scales. We check for interest points at each scale, and we combine all of these interest points together to create the final output. The larger the "circle", the larger the scale was that the point was detected at. Also, there is a line that radiates from the centre of the circle to the edge. This is the orientation of the keypoint, which we will cover next.
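Here is a small sketch of how those circles and radiating lines map onto OpenCV's data structures, under the same assumptions as above (opencv-python 4.4+, placeholder filename): every cv2.KeyPoint carries a position, a size (the circle) and an angle (the line).

```python
import cv2

img = cv2.imread('cameraman.png', cv2.IMREAD_GRAYSCALE)  # placeholder filename
kp = cv2.SIFT_create().detect(img, None)

# Each cv2.KeyPoint stores its location (pt), the scale it was detected at
# (size, the diameter of the circle) and its orientation in degrees (angle).
for k in kp[:5]:
    print(f"xy={k.pt}, size={k.size:.1f}, angle={k.angle:.1f}")

# DRAW_RICH_KEYPOINTS draws each keypoint as a circle whose radius reflects
# the detection scale, with a line from the centre showing its orientation.
out = cv2.drawKeypoints(img, kp, None,
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite('rich_keypoints.png', out)
```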
Also I found out that they have orientation but I couldn't understand what actually it is. It is an angle but between the radius and something?
Basically, if you want to detect keypoints regardless of scale and orientation: when they talk about the orientation of a keypoint, what they really mean is that they search a pixel neighbourhood surrounding the keypoint and figure out how this pixel neighbourhood is oriented, or what direction this patch is oriented in. It depends on which descriptor framework you look at, but the general gist is to detect the most dominant orientation of the gradient angles in the patch. This is important for matching so that you can match keypoints together. Take a look at the first figure I have with the two cameramen - one rotated while the other isn't. If you take a look at some of those points, how do we figure out how one point matches with another? We can easily identify that the top of the cameraman as an interest point matches with the rotated version, because we take a look at the points that surround the keypoint and see what orientation all of these points are in... and from there, that's how the orientation is computed.
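The following toy sketch shows that general idea for a single patch: build a magnitude-weighted histogram of gradient angles and take the strongest bin. This is not SIFT's exact procedure (SIFT also Gaussian-weights the contributions and interpolates the histogram peak); the 16x16 patch location and the filename are arbitrary placeholders.

```python
import cv2
import numpy as np

def dominant_orientation(patch, bins=36):
    """Estimate a dominant orientation for a patch: histogram the gradient
    angles (weighted by gradient magnitude) and return the strongest bin."""
    gx = cv2.Sobel(patch, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(patch, cv2.CV_32F, 0, 1)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 360.0

    hist, edges = np.histogram(ang, bins=bins, range=(0, 360), weights=mag)
    peak = np.argmax(hist)
    return 0.5 * (edges[peak] + edges[peak + 1])   # centre of the strongest bin

img = cv2.imread('cameraman.png', cv2.IMREAD_GRAYSCALE)  # placeholder filename
patch = img[100:116, 100:116]                            # a 16x16 neighbourhood
print(dominant_orientation(patch))
```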
Usually when we want to detect keypoints, we just take a look at the locations. However, if you want to match keypoints between images, then you definitely need the scale and the orientation to facilitate this.
Hope this helps!
I'm not as familiar with SURF, but I can tell you about SIFT, which SURF is based on. I provided a few notes about SURF at the end, but I don't know all the details.
SIFT aims to find highly-distinctive locations (or keypoints) in an image. The locations are not merely 2D locations on the image, but locations in the image's scale space, meaning they have three coordinates: x, y, and scale. The process for finding SIFT keypoints is:
1. blur and resample the image with different blur widths and sampling rates to create a scale space
2. use the difference of Gaussians method to detect blobs at different scales; the blob centers become our keypoints at a given x, y, and scale
3. assign every keypoint an orientation by calculating a histogram of gradient orientations for every pixel in its neighbourhood and picking the orientation bin with the highest number of counts
4. assign every keypoint a 128-dimensional feature vector based on the gradient orientations of pixels in 16 local neighbourhoods
Step 2 gives us scale invariance, step 3 gives us rotation invariance, and step 4 gives us a "fingerprint" of sorts that can be used to identify the keypoint. Together they can be used to match occurrences of the same feature at any orientation and scale in multiple images.
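If it helps, here is a toy sketch of steps 1 and 2 only, not a faithful SIFT implementation: blur the (placeholder) image with increasing Gaussian widths and take differences of adjacent blur levels. A real implementation would also downsample into octaves and pick local extrema across space and scale as keypoint candidates.

```python
import cv2
import numpy as np

img = cv2.imread('cameraman.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Step 1 (toy version): blur the image with increasing Gaussian widths.
sigmas = [1.6 * (2 ** (i / 3.0)) for i in range(5)]
blurred = [cv2.GaussianBlur(img, (0, 0), s) for s in sigmas]

# Step 2 (toy version): difference of Gaussians between adjacent blur levels.
# Local extrema of these DoG images (across space and scale) would become
# keypoint candidates at a given (x, y, scale).
dog = [blurred[i + 1] - blurred[i] for i in range(len(blurred) - 1)]

for i, d in enumerate(dog):
    print(f"DoG level {i}: response range [{d.min():.1f}, {d.max():.1f}]")
```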
SURF aims to accomplish the same goals as SIFT but uses some clever tricks in order to increase speed.
For blob detection it uses the determinant of Hessian method. The dominant orientation is found by examining the horizontal and vertical responses to Haar wavelets. The feature descriptor is similar to SIFT's, looking at orientations of pixels in 16 local neighbourhoods, but results in a 64-dimensional vector.
SURF features can be calculated up to 3 times faster than SIFT features, yet are just as robust in most situations.
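For completeness, a hedged SURF sketch: SURF lives in the contrib xfeatures2d module and is patented, so cv2.xfeatures2d.SURF_create() is only available if your OpenCV build enables the nonfree modules; the filename is again a placeholder.

```python
import cv2

img = cv2.imread('cameraman.png', cv2.IMREAD_GRAYSCALE)  # placeholder filename

# Requires an OpenCV build with the xfeatures2d (nonfree) contrib module.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
kp, des = surf.detectAndCompute(img, None)
print(len(kp), des.shape)   # descriptors are 64-dimensional by default

# Setting extended=True switches to the 128-dimensional SURF variant.
surf.setExtended(True)
kp, des = surf.detectAndCompute(img, None)
print(des.shape)
```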
For reference:
Reference URL: https://stackoverflow.com/questions/29133085/what-are-keypoints-in-image-processing