Saturday, July 6, 2019


Shape Bias in CNNs for Better Results, Due to the Wrong Texture Bias by Default

In their introduction, the authors of the paper below explain that it has been a common belief in the CNN community that ImageNet-trained neural networks develop a "shape bias" and store "shape representations". They propose the contrary view, that CNNs are texture-biased, and prove it with experiments:

IMAGENET-TRAINED CNNS ARE BIASED TOWARDS TEXTURE; INCREASING SHAPE BIAS IMPROVES ACCURACY AND ROBUSTNESS

To me, that texture bias has been obvious, and obviously wrong. CNNs recognize texture features and search for correlations between them; otherwise there wouldn't be adversarial hacks where changing a single pixel ruins recognition, they wouldn't need to be trained on so many examples, and they would recognize wireframe drawings/sketches as humans do, etc.
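
As a toy illustration, here is a minimal sketch (my addition; the model, the random input and the pixel position are arbitrary stand-ins) that probes a pretrained CNN's sensitivity to a single-pixel change. The actual one-pixel attack (Su et al.) searches for the pixel and color with differential evolution; this only checks one hand-picked perturbation:

    # Minimal probe of single-pixel sensitivity in a pretrained CNN.
    # Assumes torchvision is installed; the weights download on first run.
    import torch
    import torchvision.models as models

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

    x = torch.rand(1, 3, 224, 224)   # random stand-in for a preprocessed image
    with torch.no_grad():
        base = model(x).argmax(dim=1)

    x_adv = x.clone()
    x_adv[0, :, 112, 112] = torch.tensor([1.0, 0.0, 0.0])  # alter one pixel
    with torch.no_grad():
        adv = model(x_adv).argmax(dim=1)

    print("prediction changed:", (base != adv).item())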

The "right" recognition would be robust if the system can do 3D-structure-and-light reconstruction ("reverse graphics"), at best incrementally, see: 


CapsNet, capsules, vision as 3D-reconstruction and re-rendering and mainstream approval of ideas and insights of Boris Kazachenko and Todor Arnaudov, Sunday, December 31, 2017


Colour Optical Illusions are the Effect of the 3D-Reconstruction and Compensation of the Light Source Coordinates and Light Intensity in an Assumed 2D Projection of a 3D Scene, 1.1.2012  

2012, discussions at AGI List:
AGI Digest: Chairs, Caricatures and Object Recognition as 3D Reconstruction


Developmental Approach to Machine Learning, Dec 2018
https://artificial-mind.blogspot.com/2018/12/developmental-approach-to-machine.html


News: Mathematics, Rendering, Art, Drawing, Painting, Visual, Generalizing, Music, Analyzing, Tuesday, September 25, 2012


[Topology, Vector Transformations, Adjacency/Connectedness...]


https://artificial-mind.blogspot.com/2012/09/news-mathematics-rendering-art-drawing.html


"...Vector transformations

In another "unpublished paper" from a few months ago, which will eventually turn into a digest one day (it's a published email discussion), I explained and shared some elegant, fundamental AGI operations/generalizations, which are based on simple visual 3D transformations.

"Everything" is a bunch of vector transformations and the core of the general intelligence are the simplest representations of those "visual" representations, which are really simple/basic/general. 

And "visual" in human terms actually means just:

Something that encompasses features and operations in 1D, 2D, 3D and 4D (video) vector (Euclidean) spaces, where the vectors in these dimensions usually have a dimensionality of up to 4 or 5, e.g. (Luma, R, G, B):

1D - luminance
2D - luminance + uniform 1D color space
3D/4D - luminance + split/"component" color space

+ Perspective projection, which is a vector transform; it can be represented as a multiplication of matrices. That is, the initial sources of visual data are of higher dimensionality than the stored representation: 3D is projected into 2D (a drawback of the way of sensing).

Also, of course, there is topology: humans work fine with blended and deformed images - curved spaces and curves, not just simple linear vectors. However, the topology is induced from the basic vector spaces; the simplest topological representation is just the adjacency of coordinates in a matrix.

The above may seem obvious, but the goal is precisely to make things as explicit as possible...."
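
To make the two quoted claims concrete, here is a minimal sketch (my addition, not from the quoted post): a pinhole perspective projection written as a matrix multiplication that maps a 3D point to 2D, and 4-adjacency of matrix coordinates as the simplest induced topology. The focal length and the sample point are arbitrary:

    import numpy as np

    # (1) Perspective projection as matrix multiplication (pinhole model).
    f = 1.0                                   # arbitrary focal length
    P = np.array([[f, 0, 0, 0],
                  [0, f, 0, 0],
                  [0, 0, 1, 0]])              # 3x4 projection matrix

    X = np.array([2.0, 1.0, 4.0, 1.0])        # 3D point (x, y, z), homogeneous form
    x_h = P @ X                               # homogeneous 2D image point
    x_img = x_h[:2] / x_h[2]                  # divide by depth: (f*x/z, f*y/z)
    print("2D projection:", x_img)            # -> [0.5  0.25]

    # (2) The simplest topology: 4-adjacency of coordinates in an H x W matrix.
    def neighbours4(r, c, H, W):
        """Coordinates directly adjacent to (r, c)."""
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        return [(r + dr, c + dc) for dr, dc in steps
                if 0 <= r + dr < H and 0 <= c + dc < W]

    print(neighbours4(0, 0, 3, 3))            # corner pixel -> [(1, 0), (0, 1)]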

+

Sunday, April 1, 2012


https://artificial-mind.blogspot.com/2012/04/jurgen-schmidhuber-talk-on-ted-creative.html
"Todor:  And it takes many months to get to 3D-vision and to increase resolution and develop 3D-reconstruction in the human brain. That adds ~86400 fold per day and 31,536,000 "cycles" per year.
What computing power is needed?

I don't think you need millions of the most powerful GPUs and CPUs at the moment to beat human vision; we'll beat it pretty soon. In my estimation, a lot of the higher-level intelligence is very low in complexity (behavior, decision making, language at the grammar/vocabulary level) and would need a tiny amount of MIPS, FLOPS and memory. It's the lowest levels that require vast computing power - 3D reconstruction from 2D, from one or many static or moving camera sources, transformations, rotations, trajectory computations, etc. - and those problems are practically being solved and implemented...."
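
For reference, the quoted figures are simply the seconds in a day and in a 365-day year, assuming one "cycle" per second:

    seconds_per_day = 24 * 60 * 60            # 86,400
    seconds_per_year = seconds_per_day * 365  # 31,536,000
    print(seconds_per_day, seconds_per_year)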
