LRCN - Long Term Recurrent Convolutional Network

May 17, 2022

by Shashank Pandey

There have been many attempts to combine CNNs and RNNs for image-based sequence recognition and video classification. LRCN (Long-term Recurrent Convolutional Network) was first proposed by Jeff Donahue et al. in 2014 and presented at CVPR 2015. It combines a CNN with an RNN, is end-to-end trainable, and is suited to large-scale visual understanding tasks such as video description, activity recognition and image captioning. LRCN works by passing each visual input (an image or a video frame) through a feature transformation φ_V(·) with parameters V, usually a CNN, to produce a fixed-length vector representation.

The resulting feature vectors are then passed into a recurrent sequence-learning module, typically an LSTM. Because these components slot easily into existing visual recognition pipelines, they are a natural choice for perceptual problems with time-varying visual input or sequential output, and they handle such problems with little input preprocessing and no hand-designed features.
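The two stages described above can be sketched in a few lines of NumPy. This is only an illustration of the data flow, not the LRCN implementation: the "CNN" is replaced by a single linear map with ReLU, the recurrent module is a plain tanh RNN cell rather than an LSTM, and every name, shape and weight here (`extract_features`, `lrcn_forward`, `W_feat`, and so on) is a made-up assumption.

```python
import numpy as np

def extract_features(frame, W_feat):
    """Stand-in for the CNN: map one frame to a fixed-length vector."""
    return np.maximum(0.0, W_feat @ frame.ravel())  # linear map + ReLU

def lrcn_forward(frames, W_feat, W_in, W_rec, W_out):
    """Per-frame feature step followed by a simple recurrent step."""
    h = np.zeros(W_rec.shape[0])
    for frame in frames:                       # same weights at every time step
        phi = extract_features(frame, W_feat)  # fixed-length representation
        h = np.tanh(W_in @ phi + W_rec @ h)    # plain RNN cell (assumption, not an LSTM)
    logits = W_out @ h                         # classify from the final hidden state
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                     # softmax over classes

# Hypothetical sizes: 5 frames of 8x8 RGB, 16 features, 12 hidden units, 4 classes.
rng = np.random.default_rng(0)
T, H, W, C = 5, 8, 8, 3
feat_dim, hidden, n_classes = 16, 12, 4

frames = rng.normal(size=(T, H, W, C))
W_feat = rng.normal(scale=0.1, size=(feat_dim, H * W * C))
W_in = rng.normal(scale=0.1, size=(hidden, feat_dim))
W_rec = rng.normal(scale=0.1, size=(hidden, hidden))
W_out = rng.normal(scale=0.1, size=(n_classes, hidden))

probs = lrcn_forward(frames, W_feat, W_in, W_rec, W_out)
print(probs.shape)
```

The point of the sketch is that the feature extractor sees one frame at a time and shares its weights across time, while only the recurrent state carries information between frames.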

Implementation code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Conv2D, Dense, Dropout, Flatten, LSTM, MaxPooling2D, TimeDistributed)

def LRCN(self):
    # Method of a model-builder class: self.input_shape is
    # (frames, height, width, channels), self.nb_classes is the class count.
    model = Sequential()

    # TimeDistributed applies the same CNN layer to every frame in the sequence.
    model.add(TimeDistributed(Conv2D(32, (7, 7), strides=(2, 2),
        padding='same', activation='relu'), input_shape=self.input_shape))
    model.add(TimeDistributed(Conv2D(32, (3, 3),
        kernel_initializer='he_normal', activation='relu')))
    model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2))))

    model.add(TimeDistributed(Conv2D(64, (3, 3),
        padding='same', activation='relu')))
    model.add(TimeDistributed(Conv2D(64, (3, 3),
        padding='same', activation='relu')))
    model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2))))

    model.add(TimeDistributed(Conv2D(128, (3, 3),
        padding='same', activation='relu')))
    model.add(TimeDistributed(Conv2D(128, (3, 3),
        padding='same', activation='relu')))
    model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2))))

    model.add(TimeDistributed(Conv2D(256, (3, 3),
        padding='same', activation='relu')))
    model.add(TimeDistributed(Conv2D(256, (3, 3),
        padding='same', activation='relu')))
    model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2))))

    # Flatten each frame's feature map into a fixed-length vector.
    model.add(TimeDistributed(Flatten()))
    model.add(Dropout(0.7))

    # The LSTM reads the per-frame vectors and returns only its final output.
    model.add(LSTM(512, return_sequences=False, dropout=0.5))
    model.add(Dense(self.nb_classes, activation='softmax'))

    return model
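To get a feel for the size of the fixed-length vector that reaches the LSTM, the spatial arithmetic of the CNN stack above can be worked through by hand. The sketch below does that in plain Python, using the usual Keras output-size rules for 'same' and 'valid' padding. The post does not state an input size, so the 224x224 frame is an assumption, and the helper names (`conv_out`, `pool_out`, `lrcn_feature_length`) are hypothetical.

```python
import math

def conv_out(size, kernel, stride=1, padding="valid"):
    """Output length of one spatial dimension after a convolution."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    """Output length after max pooling with 'valid' padding (the Keras default)."""
    return (size - pool) // stride + 1

def lrcn_feature_length(height, width):
    """Length of the flattened per-frame vector produced by the CNN stack."""
    dims = [height, width]
    dims = [conv_out(d, 7, stride=2, padding="same") for d in dims]  # 7x7 conv, stride 2
    dims = [conv_out(d, 3) for d in dims]   # 3x3 conv without padding
    dims = [pool_out(d) for d in dims]      # first max pool
    for _ in range(3):
        # Each remaining block has two 'same' 3x3 convs (size unchanged)
        # followed by one 2x2 max pool.
        dims = [pool_out(d) for d in dims]
    return dims[0] * dims[1] * 256          # 256 channels after the last conv

# Assumed 224x224 input: 224 -> 112 -> 110 -> 55 -> 27 -> 13 -> 6 per side.
print(lrcn_feature_length(224, 224))  # 9216
```

Under that assumption each frame is reduced to a 6x6x256 feature map, so the LSTM receives a 9216-dimensional vector per time step.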

References

1. https://kobiso.github.io/research/research-lrcn/

2. http://cseweb.ucsd.edu/classes/wi19/cse291-g/student_presentations/Image_Caption_LRCN.pdf

3. https://researchcode.com/code/1669464072/long-term-recurrent-convolutional-networks-for-visual-recognition-and-description/
