Objective:
We present in this project a deep learning approach that is trained on unpaired data, namely a set of photographs that exhibits a retouching style the user likes, which is much easier to collect. Our system is formulated using deep convolutional neural networks that learn to apply different retouching operations on an input image. Network training with respect to various types of edits is enabled by modeling these retouching operations in a unified manner as resolution-independent differentiable filters. To apply the filters in a proper sequence and with suitable parameters, we employ a deep reinforcement learning approach that learns to make decisions on what action to take next, given the current state of the image. In contrast to many deep learning systems, ours provides users with an understandable solution in the form of conventional retouching edits, rather than just a “black-box” result. Through quantitative comparisons and user studies, we show that this technique generates retouching results consistent with the provided photo set.
Pre-processing
We used the MIT-Adobe FiveK dataset:
- 5,000 photos in DNG format. This format can be read by DCRAW, Adobe Lightroom, and many other tools.
- An Adobe Lightroom catalog with renditions by 5 experts. This includes the values of individual sliders and the full history of adjustments for each photo.
- Semantic information about each photo: indoor vs. outdoor, time of day, type of light, and main subject (people, nature, man-made objects, etc.).
Data augmentation and downsampling of the images are performed because the original dataset is 50 GB in size and requires Adobe Lightroom to pre-process the RAW files. The code can be found in fetch_fiveK.py.
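As a rough illustration, a minimal downsampling pass over already-exported renditions might look like the sketch below (the directory names, target size, and use of OpenCV are assumptions for this example; the augmentation steps are not shown):

```python
# Sketch of the downsampling step, assuming the expert renditions have
# already been exported from Lightroom as PNGs. Paths and the 512 px
# target size are illustrative assumptions, not the project's settings.
import glob
import os

import cv2

SRC_DIR = 'fivek/expert_c_png'   # hypothetical export directory
DST_DIR = 'fivek/expert_c_512'
TARGET = 512                     # longest edge after downsampling

os.makedirs(DST_DIR, exist_ok=True)
for path in glob.glob(os.path.join(SRC_DIR, '*.png')):
    img = cv2.imread(path, cv2.IMREAD_COLOR)
    h, w = img.shape[:2]
    scale = TARGET / max(h, w)
    img = cv2.resize(img, (int(w * scale), int(h * scale)),
                     interpolation=cv2.INTER_AREA)
    cv2.imwrite(os.path.join(DST_DIR, os.path.basename(path)), img)
```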
Model
Our differentiable model is built from image filters. We define a parent Filter class, and each filter inherits from this parent class:
```python
import tensorflow as tf
import tensorflow.contrib.layers as ly


def lrelu(x, leak=0.2):
    # Leaky ReLU used by the parameter-prediction layers.
    return tf.maximum(x, leak * x)


class Filter:
    def __init__(self, net, cfg):
        self.cfg = cfg
        self.height, self.width, self.channels = list(
            map(int, net.get_shape()[1:]))
        # Specified in child classes
        self.num_filter_parameters = None
        self.short_name = None
        self.filter_parameters = None

    def get_short_name(self):
        assert self.short_name
        return self.short_name

    def get_num_filter_parameters(self):
        assert self.num_filter_parameters
        return self.num_filter_parameters

    def get_num_mask_parameters(self):
        # No spatial masking in this snippet; masked filters override this.
        return 0

    def extract_parameters(self, features):
        # Two fully connected layers regress the filter parameters (and any
        # mask parameters) from the shared image features.
        output_dim = self.get_num_filter_parameters(
        ) + self.get_num_mask_parameters()
        features = ly.fully_connected(
            features,
            self.cfg.fc1_size,
            scope='fc1',
            activation_fn=lrelu,
            weights_initializer=tf.contrib.layers.xavier_initializer())
        features = ly.fully_connected(
            features,
            output_dim,
            scope='fc2',
            activation_fn=None,
            weights_initializer=tf.contrib.layers.xavier_initializer())
        # Split into filter parameters and mask parameters.
        return features[:, :self.get_num_filter_parameters()], \
               features[:, self.get_num_filter_parameters():]

    # Map raw network outputs to valid parameter ranges.
    # Should be implemented in child classes.
    def filter_param_regressor(self, features):
        raise NotImplementedError

    # Process the whole image, without masking.
    # Should be implemented in child classes.
    def process(self, img, param):
        raise NotImplementedError

    def debug_info_batched(self):
        return False

    def no_high_res(self):
        return False
```
GAN architecture
Both the generator and the discriminator are neural networks. The generator output is connected directly to the discriminator input. Through backpropagation, the discriminator’s classification provides a signal that the generator uses to update its weights.
We used a WGAN (Wasserstein GAN) here, combined with a Monte Carlo based reinforcement learning approach: at each step, the model picks an image operation and estimates its parameters, learning from the retouched images provided in the dataset.
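For reference, a minimal sketch of the WGAN critic objective (shown here with a gradient penalty, a common stabilization; the function names and penalty weight are assumptions for illustration):

```python
import tensorflow as tf


def wgan_gp_losses(critic, real, fake, gp_weight=10.0):
    # Illustrative WGAN-GP losses; `critic` maps an image batch to a
    # scalar score per image. Not the project's exact training code.
    d_real = critic(real)
    d_fake = critic(fake)

    # The critic widens the score gap between real retouched photos and
    # generated results; the generator tries to close it.
    d_loss = tf.reduce_mean(d_fake) - tf.reduce_mean(d_real)
    g_loss = -tf.reduce_mean(d_fake)

    # Gradient penalty on random interpolations between real and fake,
    # pushing the critic toward a 1-Lipschitz function.
    eps = tf.random_uniform([tf.shape(real)[0], 1, 1, 1], 0.0, 1.0)
    interp = eps * real + (1.0 - eps) * fake
    grads = tf.gradients(critic(interp), [interp])[0]
    slopes = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    d_loss += gp_weight * tf.reduce_mean(tf.square(slopes - 1.0))
    return d_loss, g_loss
```

In the full system, the critic's score on the retouched result also serves as the reward that the Monte Carlo policy gradient uses when choosing which operation to apply next.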
Output:
Conclusion
This method provides the following significant advantages over its peers:
- Resolution-independent filters
- Human-understandable results
- Unpaired training
Limitations:
- RL/GAN training can be unstable
- Requires careful hyperparameter tuning
- Poor performance on faces
This was my final year project, done with Jayesh.
The code can be found here.