
Convnext architecture dev #356

Open

wants to merge 9 commits into base: development

Conversation

Lorenzobattistela (Contributor):

Implementing the ConvNeXt architecture described in this paper and #353.

Reviewer: @owenvallis

This PR re-implements #354, but based on the development branch, to fix some test and formatting issues.

@Lorenzobattistela (Contributor, Author):

Test errors on CI: ImportError: cannot import name 'convnext' from 'tensorflow.keras.applications'
Maybe convnext wasn't a keras.applications module in the TensorFlow version CI is using.
(https://www.tensorflow.org/api_docs/python/tf/keras/applications/convnext)

Locally, all tests pass. My TF version is 2.13.0.

@owenvallis (Collaborator):

Hi @Lorenzobattistela, looks like convnext wasn't introduced until TF v2.10. I wonder if we can do a TF version check for this? The other option is we could provide a general architecture class that accepts a tf.keras.application as input and wraps it. I'm not sure if it would be simple to apply to all applications, but would be cleaner for the package and avoids the version issue altogether.
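A minimal sketch of the version-gate idea (the function name and error message are hypothetical, not from the PR): fail with a clear message when TF predates 2.10, the release where tf.keras.applications gained ConvNeXt.

```python
# Hypothetical helper sketching the version-check idea: instead of letting
# an ImportError surface, raise an explicit error on TF < 2.10.
def check_convnext_support(tf_version: str) -> None:
    """Raise RuntimeError if this TF version lacks keras.applications ConvNeXt."""
    major, minor = (int(p) for p in tf_version.split(".")[:2])
    if (major, minor) < (2, 10):
        raise RuntimeError(
            f"ConvNeXt requires TensorFlow >= 2.10, found {tf_version}. "
            "Upgrade TensorFlow or choose another backbone."
        )

check_convnext_support("2.13.0")  # ok, returns None
```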

@Lorenzobattistela (Contributor, Author):

So, @owenvallis, I thought about what you said. For now, I updated the tests to do version checking (skip if TF < 2.10), and maybe we can also add a version check that raises a more useful error when a user tries this with an older TF version.

However, I think refactoring toward a wrapper for keras applications is the best approach. I'm willing to work on this and will start refactoring.

It's up to you whether to merge this; maybe we can use it as a "hotfix" and then refactor to something better. Thanks for the review.

@Lorenzobattistela (Contributor, Author):

@owenvallis something is going wrong with isort, but I did run it.

convnext.trainable = True
for layer in convnext.layers:
    # freeze all layers before the last 3 blocks
    if not re.search("^block[5,6,7]|^top", layer.name):
        layer.trainable = False
@erikreed (Sep 26, 2023):
I'm also trying out this architecture, but does this EfficientNetV2 layer naming apply to ConvNeXt?

model = tf.keras.applications.ConvNeXtBase()
[l.name for l in model.layers if re.search("^block[5,6,7]|^top", l.name)]
# this outputs []

The test also suggests partial is not being applied as expected since the number of trainable layers is 0 with partial.


edit: another candidate might be "convnext_base_stage_3_block_2", also unfreezing the last layer norm since it comes after the final block.

model.trainable = True
for layer in model.layers:
    # freeze all layers before the last block
    if not re.search("^convnext_base_stage_3_block_2", layer.name):
        layer.trainable = False
model.layers[-1].trainable = True

This results in about 10% of weights being unfrozen and only the final block [1].

Total params: 87566464 (334.04 MB)
Trainable params: 8450048 (32.23 MB)
Non-trainable params: 79116416 (301.81 MB)

[1] (image attachment: model summary screenshot)
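A quick check of the two patterns against example layer names (the names here are assumptions drawn from this thread, not dumped from the models) illustrates the mismatch:

```python
import re

# Example layer names, assumed from this thread: EfficientNetV2-style names
# like "block5a_..." vs. ConvNeXtBase-style names like
# "convnext_base_stage_3_block_2_...".
effnet_names = ["block5a_expand_conv", "block6b_project_conv", "top_conv"]
convnext_names = [
    "convnext_base_stage_3_block_2_depthwise_conv",
    "convnext_base_head_layernorm",
]

effnet_pattern = r"^block[5,6,7]|^top"  # pattern from the PR diff
convnext_pattern = r"^convnext_base_stage_3_block_2"

# The EfficientNetV2 pattern matches no ConvNeXt layer name, so nothing
# stays trainable with it.
print([n for n in convnext_names if re.search(effnet_pattern, n)])  # []
```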
