Merge tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull removal of obsolete architecture ports from Arnd Bergmann:
 "This removes the entire architecture code for blackfin, cris, frv,
  m32r, metag, mn10300, score, and tile, including the associated device
  drivers.

  I have been working with the (former) maintainers for each one to
  ensure that my interpretation was right and the code is definitely
  unused in mainline kernels. Many had fond memories of working on the
  respective ports to start with and getting them included in upstream,
  but also saw no point in keeping the port alive without any users.

  In the end, it seems that while the eight architectures are extremely
  different, they all suffered the same fate: There was one company in
  charge of an SoC line, a CPU microarchitecture and a software
  ecosystem, which was more costly than licensing newer off-the-shelf
  CPU cores from a third party (typically ARM, MIPS, or RISC-V). It
  seems that all the SoC product lines are still around, but have not
  used the custom CPU architectures for several years at this point. In
  contrast, CPU instruction sets that remain popular and have actively
  maintained kernel ports tend to all be used across multiple licensees.

  [ See the new nds32 port merged in the previous commit for the next
    generation of "one company in charge of an SoC line, a CPU
    microarchitecture and a software ecosystem"   - Linus ]

  The removal came out of a discussion that is now documented at
  https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
  marking any ports as deprecated but am removing them all at once, now
  that I have made sure they are all unused. Some architectures (notably tile,
  mn10300, and blackfin) are still being shipped in products with old
  kernels, but those products will never be updated to newer kernel
  releases.

  After this series, we still have a few architectures without mainline
  gcc support:

   - unicore32 and hexagon both have very outdated gcc releases, but the
     maintainers promised to work on providing something newer. At least
     in case of hexagon, this will only be llvm, not gcc.

   - openrisc, risc-v and nds32 are still in the process of finishing
     their support or getting it added to mainline gcc in the first
     place. They all have patched gcc-7.3 ports that work to some
     degree, but complete upstream support won't happen before gcc-8.1.
     Csky posted their first kernel patch set last week; their situation
     will be similar.

  [ Palmer Dabbelt points out that RISC-V support is in mainline gcc
    since gcc-7, although gcc-7.3.0 is the recommended minimum  - Linus ]"

This really says it all:

 2498 files changed, 95 insertions(+), 467668 deletions(-)

* tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits)
  MAINTAINERS: UNICORE32: Change email account
  staging: iio: remove iio-trig-bfin-timer driver
  tty: hvc: remove tile driver
  tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers
  serial: remove tile uart driver
  serial: remove m32r_sio driver
  serial: remove blackfin drivers
  serial: remove cris/etrax uart drivers
  usb: Remove Blackfin references in USB support
  usb: isp1362: remove blackfin arch glue
  usb: musb: remove blackfin port
  usb: host: remove tilegx platform glue
  pwm: remove pwm-bfin driver
  i2c: remove bfin-twi driver
  spi: remove blackfin related host drivers
  watchdog: remove bfin_wdt driver
  can: remove bfin_can driver
  mmc: remove bfin_sdh driver
  input: misc: remove blackfin rotary driver
  input: keyboard: remove bf54x driver
  ...
Linus Torvalds
2018-04-02 20:20:12 -07:00


@@ -115,78 +115,6 @@
/* adapting coeffs using the traditional stochastic descent (N)LMS algorithm */
#ifdef __bfin__
static inline void lms_adapt_bg(struct oslec_state *ec, int clean, int shift)
{
int i;
int offset1;
int offset2;
int factor;
int exp;
int16_t *phist;
int n;
if (shift > 0)
factor = clean << shift;
else
factor = clean >> -shift;
/* Update the FIR taps */
offset2 = ec->curr_pos;
offset1 = ec->taps - offset2;
phist = &ec->fir_state_bg.history[offset2];
/* st: and en: help us locate the assembler in echo.s */
/* asm("st:"); */
n = ec->taps;
for (i = 0; i < n; i++) {
exp = *phist++ * factor;
ec->fir_taps16[1][i] += (int16_t) ((exp + (1 << 14)) >> 15);
}
/* asm("en:"); */
/* Note the asm for the inner loop above generated by Blackfin gcc
4.1.1 is pretty good (note even parallel instructions used):
R0 = W [P0++] (X);
R0 *= R2;
R0 = R0 + R3 (NS) ||
R1 = W [P1] (X) ||
nop;
R0 >>>= 15;
R0 = R0 + R1;
W [P1++] = R0;
A block based update algorithm would be much faster but the
above can't be improved on much. Every instruction saved in
the loop above is 2 MIPs/ch! The for loop above is where the
Blackfin spends most of it's time - about 17 MIPs/ch measured
with speedtest.c with 256 taps (32ms). Write-back and
Write-through cache gave about the same performance.
*/
}
/*
IDEAS for further optimisation of lms_adapt_bg():
1/ The rounding is quite costly. Could we keep as 32 bit coeffs
then make filter pluck the MS 16-bits of the coeffs when filtering?
However this would lower potential optimisation of filter, as I
think the dual-MAC architecture requires packed 16 bit coeffs.
2/ Block based update would be more efficient, as per comments above,
could use dual MAC architecture.
3/ Look for same sample Blackfin LMS code, see if we can get dual-MAC
packing.
4/ Execute the whole e/c in a block of say 20ms rather than sample
by sample. Processing a few samples every ms is inefficient.
*/
#else
static inline void lms_adapt_bg(struct oslec_state *ec, int clean, int shift)
{
int i;
@@ -215,7 +143,6 @@ static inline void lms_adapt_bg(struct oslec_state *ec, int clean, int shift)
ec->fir_taps16[1][i] += (int16_t) ((exp + (1 << 14)) >> 15);
}
}
#endif
static inline int top_bit(unsigned int bits)
{


@@ -27,14 +27,6 @@
#define _FIR_H_
/*
Blackfin NOTES & IDEAS:
A simple dot product function is used to implement the filter. This performs
just one MAC/cycle which is inefficient but was easy to implement as a first
pass. The current Blackfin code also uses an unrolled form of the filter
history to avoid 0 length hardware loop issues. This is wasteful of
memory.
Ideas for improvement:
1/ Rewrite filter for dual MAC inner loop. The issue here is handling
@@ -94,21 +86,13 @@ static inline const int16_t *fir16_create(struct fir16_state_t *fir,
fir->taps = taps;
fir->curr_pos = taps - 1;
fir->coeffs = coeffs;
#if defined(__bfin__)
fir->history = kcalloc(2 * taps, sizeof(int16_t), GFP_KERNEL);
#else
fir->history = kcalloc(taps, sizeof(int16_t), GFP_KERNEL);
#endif
return fir->history;
}
static inline void fir16_flush(struct fir16_state_t *fir)
{
#if defined(__bfin__)
memset(fir->history, 0, 2 * fir->taps * sizeof(int16_t));
#else
memset(fir->history, 0, fir->taps * sizeof(int16_t));
#endif
}
static inline void fir16_free(struct fir16_state_t *fir)
@@ -116,42 +100,9 @@ static inline void fir16_free(struct fir16_state_t *fir)
kfree(fir->history);
}
#ifdef __bfin__
static inline int32_t dot_asm(short *x, short *y, int len)
{
int dot;
len--;
__asm__("I0 = %1;\n\t"
"I1 = %2;\n\t"
"A0 = 0;\n\t"
"R0.L = W[I0++] || R1.L = W[I1++];\n\t"
"LOOP dot%= LC0 = %3;\n\t"
"LOOP_BEGIN dot%=;\n\t"
"A0 += R0.L * R1.L (IS) || R0.L = W[I0++] || R1.L = W[I1++];\n\t"
"LOOP_END dot%=;\n\t"
"A0 += R0.L*R1.L (IS);\n\t"
"R0 = A0;\n\t"
"%0 = R0;\n\t"
: "=&d"(dot)
: "a"(x), "a"(y), "a"(len)
: "I0", "I1", "A1", "A0", "R0", "R1"
);
return dot;
}
#endif
static inline int16_t fir16(struct fir16_state_t *fir, int16_t sample)
{
int32_t y;
#if defined(__bfin__)
fir->history[fir->curr_pos] = sample;
fir->history[fir->curr_pos + fir->taps] = sample;
y = dot_asm((int16_t *) fir->coeffs, &fir->history[fir->curr_pos],
fir->taps);
#else
int i;
int offset1;
int offset2;
@@ -165,7 +116,6 @@ static inline int16_t fir16(struct fir16_state_t *fir, int16_t sample)
y += fir->coeffs[i] * fir->history[i - offset1];
for (; i >= 0; i--)
y += fir->coeffs[i] * fir->history[i + offset2];
#endif
if (fir->curr_pos <= 0)
fir->curr_pos = fir->taps;
fir->curr_pos--;