The input data of photometric stereo have three dimensions: one photometric dimension over the different lights and two spatial dimensions, i.e., rows and columns in the image coordinate system. Recent deep-learning-based photometric stereo algorithms usually use 2-D/3-D convolutions to process this three-dimensional input. Due to the dimension mismatch, these algorithms must make assumptions that violate natural characteristics of the photometric stereo problem, e.g., spatial pixel interdependence and light permutation invariance. In this article, we propose a self-attention photometric stereo network (SPS-Net), which can exploit the information in all three dimensions without violating these natural characteristics. In SPS-Net, the spatial information is extracted by convolutional layers, and the photometric information is aggregated by the proposed photometric fusion blocks based on the self-attention mechanism. Extensive experiments on both synthetic and real-world data sets are conducted. The proposed SPS-Net achieved higher performance than state-of-the-art algorithms on the photometric stereo task with dense lighting. Without any changes, the proposed algorithm also outperformed the benchmarks in sparse and light-information-robust photometric stereo tasks.
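To illustrate the key property the abstract emphasizes, the following minimal numpy sketch shows how self-attention over the light dimension, followed by a symmetric pooling step, yields a per-pixel descriptor that is invariant to the order of the light sources. This is an illustrative toy example, not the actual SPS-Net photometric fusion block; the function name `photometric_fusion` and the single-head, max-pooled design are assumptions for demonstration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def photometric_fusion(feats, Wq, Wk, Wv):
    """Toy self-attention fusion over the light (photometric) dimension.

    feats: (L, C) per-pixel features, one row per light direction.
    Wq, Wk, Wv: (C, C) projection matrices (shared across lights).
    Returns a (C,) descriptor that does not depend on the order of lights.
    """
    Q, K, V = feats @ Wq, feats @ Wk, feats @ Wv
    # Scaled dot-product attention among the L light observations.
    attn = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)  # (L, L)
    out = attn @ V  # (L, C): permutation-EQUIVARIANT per-light features
    # A symmetric pooling over the light axis makes the result
    # permutation-INVARIANT, matching the natural characteristic
    # that re-ordering the input lights should not change the output.
    return out.max(axis=0)

# Usage: permuting the lights leaves the fused descriptor unchanged.
rng = np.random.default_rng(0)
L, C = 8, 4
feats = rng.normal(size=(L, C))
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))
fused = photometric_fusion(feats, Wq, Wk, Wv)
fused_permuted = photometric_fusion(feats[rng.permutation(L)], Wq, Wk, Wv)
print(np.allclose(fused, fused_permuted))
```

Because self-attention treats the light dimension as an unordered set, no assumption about the number or ordering of lights is baked in, which is why the same network can handle dense, sparse, and light-information-robust settings.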